Recursive type-IV discrete cosine transform system

ABSTRACT

A recursive type-IV discrete cosine transform system includes a first permutation device, a recursive type-III discrete cosine/sine transform device, a cosine/sine factor generation device, a recursive type-II discrete cosine/sine transform device, and a second permutation device. The first permutation device performs a two-dimensional order permutation operation on N digital input signals for generating N two-dimensional first temporal signals. The recursive type-III discrete cosine/sine transform device repeats a type-III discrete cosine/sine transform for generating second temporal signals. The cosine/sine factor generation device sequentially performs cosine/sine factor multiplication and corresponding addition operations for generating third temporal signals. The recursive type-II discrete cosine/sine transform device repeats a type-II discrete cosine/sine transform for generating fourth temporal signals. The second permutation device performs a one-dimensional order permutation operation for generating N one-dimensional output signals. The N one-dimensional output signals are obtained by performing a type-IV discrete cosine transform on the N digital input signals.

CROSS REFERENCE TO RELATED APPLICATION

This application claims the benefits of the Taiwan Patent Application Serial Number 101100102, filed on Jan. 2, 2012, the subject matter of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the technical field of digital signal processing and, more particularly, to a recursive type-IV discrete cosine transform (DCT) system.

2. Description of Related Art

With the development of digital signal processing technologies, various messages and media information can be conveniently obtained in daily living. A variety of modified discrete cosine transforms (MDCTs) and inverse modified discrete cosine transforms (IMDCTs) are widely used in various audio codec standards.

The audio codec standards include, for example, MP3, AAC, AC-3, TwinVQ, and Ogg. The MDCT and IMDCT operations in an audio codec account for a very large portion of the overall operational complexity. If the MDCT and IMDCT are implemented with the same approach, hardware can be shared in the design, reducing both the hardware requirement and the MDCT/IMDCT operational complexity.

A high-efficiency Advanced Audio Coding (HE-AAC) audio codec uses high-quality spectral band replication (HQ-SBR) or low-power spectral band replication (LP-SBR) technologies, in which complex-domain analysis quadrature mirror filter-banks (complex AQMFs) and synthesis quadrature mirror filter-banks (complex SQMFs) can be derived as the DCT of type III (DCT-III) and DCT-II kernel methods. Therefore, to implement an AQMF and SQMF co-architecture, support for the DCT-IV/DCT-III/DCT-II operations, in addition to the MDCT and IMDCT computation, is an essential key in hardware design.

However, the typical recursive architecture for IMDCT implementations has the disadvantages of requiring numerous operational periods and overtime computation, and of being difficult to implement as a co-architecture design for different operations such as the MDCT, the AQMF at a decoder, and the SQMF at an encoder. When the typical recursive architecture must support a higher bit rate, only the hardware or the timing can be increased. However, increased hardware means increased cost, and increased timing means higher power consumption. In addition, to provide the MDCT, AQMF, and SQMF operational capabilities concurrently, different hardware architectures must be designed for the operations, which also means additional hardware design cost.

Although recursive discrete Fourier transforms (RDFTs) have been developed and advanced for many years, it is still required to further reduce the operational complexity and hardware cost and to increase the data computational performance.

Therefore, it is desirable to provide an improved RDFT system to mitigate and/or obviate the aforementioned problems.

SUMMARY OF THE INVENTION

The object of the present invention is to provide a recursive type-IV discrete cosine transform system, which has a low operational complexity, a small number of multiplication coefficients, and high-performance data computation.

According to a feature of the present invention, a recursive type-IV discrete cosine transform system is provided, which includes a first permutation device, a recursive type-III discrete cosine/sine transform device, a cosine/sine factor generation device, a recursive type-II discrete cosine/sine transform device, and a second permutation device. The first permutation device receives N digital input signals and performs a two-dimensional order permutation operation on the N digital signals for generating N two-dimensional first temporal signals, where N is a positive integer. The recursive type-III discrete cosine/sine transform device is an m-point recursive type-III discrete cosine/sine transform device connected to the first permutation device in order to receive the N first temporal signals and repeat a type-III discrete cosine/sine transform c times on the N first temporal signals for generating c second temporal signals each with m points, where N=m×c, and m and c are each a positive integer. The cosine/sine factor generation device is connected to the recursive type-III discrete cosine/sine transform device in order to sequentially perform cosine/sine factor multiplication and corresponding addition operations on the m-point second temporal signals for generating c third temporal signals with m points. The recursive type-II discrete cosine/sine transform device is a c-point recursive type-II discrete cosine/sine transform device connected to the cosine/sine factor generation device in order to receive the third temporal signals and repeat a type-II discrete cosine/sine transform m times for generating m fourth temporal signals each with c points. The second permutation device is connected to the recursive type-II discrete cosine/sine transform device in order to receive the fourth temporal signals and perform a one-dimensional order permutation operation on the fourth temporal signals for generating N one-dimensional output signals, wherein the N one-dimensional output signals are obtained by performing a type-IV discrete cosine transform on the N digital input signals.

According to another feature of the present invention, a recursive type-IV discrete cosine transform system is provided, which includes a first permutation device, a modified recursive type-III discrete cosine/sine transform device, a recursive type-II discrete cosine/sine transform device, and a second permutation device. The first permutation device receives N digital input signals and performs a two-dimensional order permutation operation on the N digital signals for generating N two-dimensional first temporal signals, where N is a positive integer. The modified recursive type-III discrete cosine/sine transform device is connected to the first permutation device and has a first and a second operational mode such that in the first operational mode a type-III discrete cosine/sine transform is repeated c times on the N first temporal signals for generating c second temporal signals each with m points, where N=m×c, and m and c are each a positive integer. The recursive type-II discrete cosine/sine transform device is connected to the modified recursive type-III discrete cosine/sine transform device and has a first and a second operational mode such that in the first operational mode a third temporal signal is received and a type-II discrete cosine/sine transform is repeated m times on the third temporal signal for generating m fourth temporal signals each with c points. The second permutation device is connected to the recursive type-II discrete cosine/sine transform device in order to receive the fourth temporal signals and perform a one-dimensional order permutation operation on the fourth temporal signals for generating N one-dimensional output signals, wherein the N one-dimensional output signals are obtained by performing a type-IV discrete cosine transform on the N digital input signals.

Other objects, advantages, and novel features of the invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a recursive type-IV discrete cosine transform system according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a DCT-IV (discrete cosine transform of type-IV) operation according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of input data mapping according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a relationship between indexes n₀, k₁ according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a relationship between an m-point DCT-III/DST-III and a c-point DCT-II/DST-II according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a pipelined DCT-IV according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of using additional adder and registers according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of using additional adder and registers according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a DCT-III/DST-III hardware architecture according to an embodiment of the present invention;

FIG. 10 is a schematic diagram of a hardware architecture corresponding to equation (56) according to an embodiment of the present invention;

FIG. 11 is a schematic diagram of using additional adder and registers according to an embodiment of the present invention;

FIG. 12 is a schematic diagram of a DCT-II/DST-II hardware architecture according to an embodiment of the present invention;

FIG. 13 is a schematic diagram of a recursive type-IV discrete cosine transform system according to another embodiment of the present invention;

FIG. 14 is a schematic diagram of a cosine/sine factor generation device according to the present invention;

FIG. 15 is a schematic diagram of a hardware action of inputting upper half data in a folding operation according to the present invention;

FIG. 16 is a schematic diagram of a hardware action of inputting lower half data in a folding operation according to the present invention;

FIGS. 17(A) and 17(B) are schematic diagrams of a complete intermediate-stage operation architecture according to the present invention;

FIG. 18 is a schematic diagram of operations corresponding to halt cycles according to the present invention;

FIG. 19 is a schematic diagram of a modified recursive type-III discrete cosine/sine transform device according to the present invention;

FIG. 20 is a schematic diagram of a recursive type-II discrete cosine/sine transform device according to the present invention;

FIG. 21 is a schematic diagram of using common multiplexers according to the present invention; and

FIGS. 22(A) and 22(B) are schematic diagrams of using common multipliers and adders according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 is a schematic diagram of a recursive type-IV discrete cosine transform system 100 according to an embodiment of the present invention. The system 100 includes a first permutation device 110, a recursive type-III discrete cosine/sine transform device 120, a cosine/sine factor generation device 130, a recursive type-II discrete cosine/sine transform device 140, and a second permutation device 150.

The first permutation device 110 receives N digital input signals and performs a two-dimensional order permutation operation on the N digital signals for generating N two-dimensional first temporal signals, where N is a positive integer.

The recursive type-III discrete cosine/sine transform device 120, which is an m-point recursive type-III discrete cosine/sine transform device, is connected to the first permutation device 110 in order to receive the N first temporal signals and repeat a type-III discrete cosine/sine transform c times on the N first temporal signals for generating c second temporal signals each with m points, where N=m×c, and m and c are each a positive integer.

The cosine/sine factor generation device 130 is connected to the recursive type-III discrete cosine/sine transform device 120 in order to sequentially perform cosine/sine factor multiplication and corresponding addition operations on the m-point second temporal signals for generating c third temporal signals with m points.

The recursive type-II discrete cosine/sine transform device 140, which is a c-point recursive type-II discrete cosine/sine transform device, is connected to the cosine/sine factor generation device 130 in order to receive the third temporal signals and repeat a type-II discrete cosine/sine transform m times for generating m fourth temporal signals each with c points.

The second permutation device 150 is connected to the recursive type-II discrete cosine/sine transform device 140 in order to receive the fourth temporal signals and perform a one-dimensional order permutation operation on the fourth temporal signals for generating N one-dimensional output signals, wherein the N one-dimensional output signals are obtained by performing a type-IV discrete cosine transform (DCT-IV) on the N digital input signals.

For implementing a common architecture or co-architecture of analysis and synthesis filter-banks, the invention uses a DCT-IV kernel method to implement the modified DCT (MDCT) and inverse MDCT (IMDCT).

The MDCT and IMDCT math models are defined respectively in equation (1) and equation (2), where k ranges from zero to (N/2)−1, n ranges from zero to N−1, and M=N/2.

$\begin{matrix}{{{X_{c}\lbrack k\rbrack} = {\sum\limits_{n = 0}^{N - 1}\;{{x\lbrack n\rbrack} \times {\cos\left( \frac{\left( {{2n} + 1 + M} \right)\left( {{2k} + 1} \right)\pi}{2N} \right)}}}},} & (1) \\{{{\hat{x}}_{c}\lbrack n\rbrack} = {\sum\limits_{k = 0}^{\frac{N}{2} - 1}\;{{X_{c}\lbrack k\rbrack} \times {{\cos\left( \frac{\left( {{2n} + 1 + M} \right)\left( {{2k} + 1} \right)\pi}{2N} \right)}.}}}} & (2)\end{matrix}$

After an order permutation, the above equations are rewritten as equation (3) and equation (4):

$\begin{matrix}
{X_{c}[k] = \sum\limits_{n = 0}^{N/2 - 1} \mathrm{Pre}_{c}[n] \times \cos\left( \frac{(2n + 1)(2k + 1)\pi}{2N} \right),} & (3) \\
{\mathrm{Post}_{c}[n] = \sum\limits_{k = 0}^{N/2 - 1} X_{c}[k] \times \cos\left( \frac{(2n + 1)(2k + 1)\pi}{2N} \right),} & (4) \\
{\text{where}} & \; \\
{\mathrm{Pre}_{c}[n] = -\left( x\left[ n + \frac{3N}{4} \right] + x\left[ \frac{3N}{4} - 1 - n \right] \right), \quad n = 0 \text{ to } N/4 - 1,} & \; \\
{\mathrm{Pre}_{c}[n] = x\left[ n - \frac{N}{4} \right] - x\left[ \frac{3N}{4} - 1 - n \right], \quad n = N/4 \text{ to } N/2 - 1,} & (5) \\
{\mathrm{Post}_{c}[n] = -\hat{x}_{c}\left[ \frac{3N}{4} - 1 - n \right] = -\hat{x}_{c}\left[ n + \frac{3N}{4} \right], \quad n = 0 \text{ to } N/4 - 1.} & (6)
\end{matrix}$

From equation (3) and equation (4), it is clear that the MDCT and IMDCT operations can each be converted into a DCT-IV operation. If the DCT-IV operation is effectively shared and reduced, the computational complexity of both processes is reduced accordingly.
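The conversion of the MDCT into a DCT-IV can be checked numerically. The following Python sketch is an illustration only, not part of the disclosed hardware: the function names `mdct_direct`, `fold`, and `dct4`, the test length of 16 points, and the random input are all assumptions made for the example. It evaluates equation (1) directly and compares the result with equation (3) applied to the folded sequence of equation (5).

```python
import math
import random


def mdct_direct(x):
    """Equation (1): N inputs -> N/2 outputs, with M = N/2."""
    N = len(x)
    M = N // 2
    return [
        sum(x[n] * math.cos((2 * n + 1 + M) * (2 * k + 1) * math.pi / (2 * N))
            for n in range(N))
        for k in range(N // 2)
    ]


def fold(x):
    """Equation (5): fold N inputs into the N/2-point sequence Pre_c."""
    N = len(x)  # assumed to be a multiple of 4
    pre = [0.0] * (N // 2)
    for n in range(N // 4):
        pre[n] = -(x[n + 3 * N // 4] + x[3 * N // 4 - 1 - n])
    for n in range(N // 4, N // 2):
        pre[n] = x[n - N // 4] - x[3 * N // 4 - 1 - n]
    return pre


def dct4(v):
    """Equation (3) kernel: a direct DCT-IV over len(v) = N/2 points."""
    L = len(v)
    return [
        sum(v[n] * math.cos((2 * n + 1) * (2 * k + 1) * math.pi / (4 * L))
            for n in range(L))
        for k in range(L)
    ]


random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(16)]  # hypothetical test frame
direct = mdct_direct(x)
via_dct4 = dct4(fold(x))
max_err = max(abs(a - b) for a, b in zip(direct, via_dct4))
print("largest difference between equation (1) and equation (3):", max_err)
```

The two results should agree to within floating-point rounding, which is why only the DCT-IV kernel needs to be accelerated in what follows.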

Compared with a parallel architecture, a recursive circuit has the advantages of small area, low power consumption, and a flexible point number, but it also has the disadvantages of excess operational periods and overtime computation. For audio codec applications, such as the long windows of Advanced Audio Coding (AAC; 2048 points), TwinVQ (4096 points), and Ogg (up to 8192 points), the real-time computational requirement is difficult to achieve due to the high point numbers.

Accordingly, the present invention applies a variable transform in a DCT-IV operation to increase the speed of the recursive architecture. In this case, the original one-dimensional computation equation is divided into two-dimensional operations, which shortens the cycle of a recursive operation.

An M-point DCT-IV math model is defined in equation (7) as follows.

$\begin{matrix}{X[k] = \sum\limits_{n = 0}^{M - 1} x[n] \times \cos\left( \frac{(2n + 1)(2k + 1)\pi}{4M} \right),} & (7)\end{matrix}$
where M=m×c. Assume n=n₀+c×n₁ and k=m×k₀+k₁, and plug them into equation (7), so

$\begin{matrix}
{X[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} + \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} + \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right)} & \; \\
{= \sum\limits_{n_{0} = 0}^{c - 1} \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \left\{ \cos\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right) \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} + \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) - \sin\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right) \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} + \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \right\}} & \; \\
{= X_{c}[m \times k_{0} + k_{1}] - X_{s}[m \times k_{0} + k_{1}],} & (8) \\
{\text{where}} & \; \\
{X_{c}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \cos\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right) \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} + \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right),} & (9) \\
{\text{and}} & \; \\
{X_{s}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \sin\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right) \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} + \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right).} & (10)
\end{matrix}$
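The index substitution behind equation (8) can be verified by brute force. The sketch below is an illustration under assumed sizes (m=4 and c=3 are hypothetical, chosen only to keep the loops small). It checks that the DCT-IV cosine of equation (7) equals the sign factor (−1)^(n₁k₀) times the cosine of the three-term phase in equation (8), and that the two mappings n=n₀+c×n₁ and k=m×k₀+k₁ each visit every index from 0 to M−1 exactly once.

```python
import math

m, c = 4, 3          # hypothetical factor sizes, M = m * c
M = m * c

worst = 0.0
for n0 in range(c):
    for n1 in range(m):
        for k0 in range(c):
            for k1 in range(m):
                n = n0 + c * n1
                k = m * k0 + k1
                # Left side: the one-dimensional kernel of equation (7).
                lhs = math.cos((2 * n + 1) * (2 * k + 1) * math.pi / (4 * M))
                # Right side: the three-term phase of equation (8).
                phase = ((2 * k1 + 1) * n1 * math.pi / (2 * m)
                         + (2 * n0 + 1) * (2 * k1 + 1) * math.pi / (4 * M)
                         + (2 * n0 + 1) * k0 * math.pi / (2 * c))
                rhs = (-1) ** (n1 * k0) * math.cos(phase)
                worst = max(worst, abs(lhs - rhs))

# Both index mappings should cover 0..M-1 without gaps or repeats.
n_values = sorted(n0 + c * n1 for n0 in range(c) for n1 in range(m))
k_values = sorted(m * k0 + k1 for k0 in range(c) for k1 in range(m))
print("largest kernel mismatch:", worst)
```

The fourth term of the expanded phase, n₁k₀π, is what produces the sign factor, since adding an integer multiple of π to the argument of a cosine only flips its sign.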

Using the trigonometric angle-sum identities, equation (9) can be expanded as:

$\begin{matrix}
{X_{c}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \cos\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right) \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \left\{ \cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) - \sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \right\}} & \; \\
{= \sum\limits_{n_{0} = 0}^{c - 1} \cos\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right) \left\{ \left[ \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \times \cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \right] - \left[ \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \times \sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \right] \right\}} & \; \\
{= X_{c0}[m \times k_{0} + k_{1}] - X_{c1}[m \times k_{0} + k_{1}],} & (11) \\
{\text{where}} & \; \\
{X_{c0}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \times \cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \cos\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right),} & (12) \\
{\text{and}} & \; \\
{X_{c1}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \times \sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \cos\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right).} & (13)
\end{matrix}$

Similarly, using the trigonometric angle-sum identities, equation (10) can be expanded as:

$\begin{matrix}
{X_{s}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \sin\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right) \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \left\{ \sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) + \cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \right\}} & \; \\
{= \sum\limits_{n_{0} = 0}^{c - 1} \sin\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right) \left\{ \left[ \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \right] + \left[ \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \right] \right\}} & \; \\
{= X_{s0}[m \times k_{0} + k_{1}] + X_{s1}[m \times k_{0} + k_{1}],} & (14) \\
{\text{where}} & \; \\
{X_{s0}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \times \sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \sin\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right),} & (15) \\
{\text{and}} & \; \\
{X_{s1}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}k_{0}} \times x[n_{0} + c \times n_{1}] \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right) \times \cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) \times \sin\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right).} & (16)
\end{matrix}$

By considering a change of the index k₀, the kernel operations in equation (12), equation (13), equation (15), and equation (16) are defined as:

$\begin{matrix}{{{A\left( {n_{0},k_{1},k_{0}} \right)} = {\sum\limits_{n_{1} = 0}^{m\; - 1}{\left( {- 1} \right)^{n_{1}k_{0}} \times {x\left\lbrack {n_{0} + {c \times n_{1}}} \right\rbrack} \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},} & (17) \\{{B\left( {n_{0},k_{1},k_{0}} \right)} = {\sum\limits_{n_{1} = 0}^{m\; - 1}{\left( {- 1} \right)^{n_{1}k_{0}} \times {x\left\lbrack {n_{0} + {c \times n_{1}}} \right\rbrack} \times {{\sin\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}.}}}} & (18)\end{matrix}$

If k₀ is an odd number,

$\begin{matrix}{{{A\left( {n_{0},k_{1},1} \right)} = {\sum\limits_{n_{1} = 0}^{m\; - 1}{\left( {- 1} \right)^{n_{1}} \times {x\left\lbrack {n_{0} + {c \times n_{1}}} \right\rbrack} \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},} & (19) \\{{B\left( {n_{0},k_{1},1} \right)} = {\sum\limits_{n_{1} = 0}^{m\; - 1}{\left( {- 1} \right)^{n_{1}} \times {x\left\lbrack {n_{0} + {c \times n_{1}}} \right\rbrack} \times {{\sin\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}.}}}} & (20)\end{matrix}$

If k₀ is an even number,

$\begin{matrix}{{{A\left( {n_{0},k_{1},0} \right)} = {\sum\limits_{n_{1} = 0}^{m\; - 1}{{x\left\lbrack {n_{0} + {c \times n_{1}}} \right\rbrack} \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},} & (21) \\{{B\left( {n_{0},k_{1},0} \right)} = {\sum\limits_{n_{1} = 0}^{m\; - 1}{{x\left\lbrack {n_{0} + {c \times n_{1}}} \right\rbrack} \times {{\sin\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}.}}}} & (22)\end{matrix}$

From equation (19) and equation (21), it is known that, as k₀ changes, A(n₀, k₁, k₀) takes only two forms, A(n₀, k₁, 1) and A(n₀, k₁, 0). Similarly, from equation (20) and equation (22), it is known that, as k₀ changes, B(n₀, k₁, k₀) takes only two forms, B(n₀, k₁, 1) and B(n₀, k₁, 0). Such a feature can relatively reduce the computational amount of equation (17) and equation (18).

Replace k₁ with m−1−k₁ in equation (19) to derive the relation between equation (19) and equation (21), so as to have:

$\begin{matrix}\begin{matrix}
{A(n_{0}, m - 1 - k_{1}, 1) = \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}} \times x[n_{0} + c \times n_{1}] \times \cos\left( n_{1}\pi - \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right)} \\
{= \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{2n_{1}} \times x[n_{0} + c \times n_{1}] \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right)} \\
{= \sum\limits_{n_{1} = 0}^{m - 1} x[n_{0} + c \times n_{1}] \times \cos\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right)} \\
{= A(n_{0}, k_{1}, 0).}
\end{matrix} & (23)\end{matrix}$

Similarly, replace k₁ with m−1−k₁ in equation (20) to derive the relation between equation (20) and equation (22), so

$\begin{matrix}\begin{matrix}
{B(n_{0}, m - 1 - k_{1}, 1) = \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{n_{1}} \times x[n_{0} + c \times n_{1}] \times \sin\left( n_{1}\pi - \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right)} \\
{= \sum\limits_{n_{1} = 0}^{m - 1} (-1)^{2n_{1} + 1} \times x[n_{0} + c \times n_{1}] \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right)} \\
{= -\sum\limits_{n_{1} = 0}^{m - 1} x[n_{0} + c \times n_{1}] \times \sin\left( \frac{(2k_{1} + 1)n_{1}\pi}{2m} \right)} \\
{= -B(n_{0}, k_{1}, 0).}
\end{matrix} & (24)\end{matrix}$

Accordingly, from equation (23) and equation (24), we have:

$\begin{matrix}{A(n_{0}, k_{1}, 1) = A(n_{0}, m - 1 - k_{1}, 0),} & (25) \\ {B(n_{0}, k_{1}, 1) = -B(n_{0}, m - 1 - k_{1}, 0).} & (26)\end{matrix}$

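By means of equation (25) and equation (26), the operations of equation (19) and equation (20) can be simplified, because the odd-k₀ kernels are obtained from the even-k₀ kernels of equation (21) and equation (22) by reversing the index k₁ and, for B, changing the sign.

By plugging the results into equation (12), equation (13), equation (15), and equation (16), we have: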
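Equation (25) and equation (26) can be confirmed numerically. The following sketch is illustrative only: the helper names `kernel_a` and `kernel_b`, the sizes m=8 and c=4, and the random input are assumptions for the example. It evaluates the kernels of equation (17) and equation (18) directly and measures how far the odd-k₀ values are from the index-reversed even-k₀ values.

```python
import math
import random


def kernel_a(x, c, m, n0, k1, k0):
    """Cosine kernel A(n0, k1, k0) of equation (17)."""
    return sum((-1) ** (n1 * k0) * x[n0 + c * n1]
               * math.cos((2 * k1 + 1) * n1 * math.pi / (2 * m))
               for n1 in range(m))


def kernel_b(x, c, m, n0, k1, k0):
    """Sine kernel B(n0, k1, k0) of equation (18)."""
    return sum((-1) ** (n1 * k0) * x[n0 + c * n1]
               * math.sin((2 * k1 + 1) * n1 * math.pi / (2 * m))
               for n1 in range(m))


m, c = 8, 4  # hypothetical sizes
random.seed(1)
x = [random.uniform(-1.0, 1.0) for _ in range(m * c)]

# Equation (25): A(n0, k1, 1) == A(n0, m-1-k1, 0)
err_a = max(abs(kernel_a(x, c, m, n0, k1, 1)
                - kernel_a(x, c, m, n0, m - 1 - k1, 0))
            for n0 in range(c) for k1 in range(m))

# Equation (26): B(n0, k1, 1) == -B(n0, m-1-k1, 0)
err_b = max(abs(kernel_b(x, c, m, n0, k1, 1)
                + kernel_b(x, c, m, n0, m - 1 - k1, 0))
            for n0 in range(c) for k1 in range(m))

print("equation (25) residual:", err_a)
print("equation (26) residual:", err_b)
```

Both residuals should be at the level of floating-point rounding, so only the even-k₀ kernels need to be computed and stored.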

$\begin{matrix}{{{X_{c\; 0}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack} = {\sum\limits_{n_{0} = 0}^{c - 1}\;{{A\left( {n_{0},k_{1},k_{0}} \right)} \times {\cos\left( \frac{\left( {{2n_{0}} + 1} \right)\left( {{2k_{1}} + 1} \right)\pi}{4M} \right)} \times {\cos\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\pi}{2c} \right)}}}},} & (27) \\{{{X_{c\; 1}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack} = {\sum\limits_{n_{0} = 0}^{c - 1}\;{{B\left( {n_{0},k_{1},k_{0}} \right)} \times {\sin\left( \frac{\left( {{2n_{0}} + 1} \right)\left( {{2k_{1}} + 1} \right)\pi}{4M} \right)} \times {\cos\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\pi}{2c} \right)}}}},} & (28) \\{{{X_{s\; 0}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack} = {\sum\limits_{n_{0} = 0}^{c - 1}\;{{A\left( {n_{0},k_{1},k_{0}} \right)} \times {\sin\left( \frac{\left( {{2n_{0}} + 1} \right)\left( {{2k_{1}} + 1} \right)\pi}{4M} \right)} \times {\sin\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\pi}{2c} \right)}}}},} & (29) \\{{X_{s\; 1}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack} = {\sum\limits_{n_{0} = 0}^{c - 1}\;{{B\left( {n_{0},k_{1},k_{0}} \right)} \times {\cos\left( \frac{\left( {{2n_{0}} + 1} \right)\left( {{2k_{1}} + 1} \right)\pi}{4M} \right)} \times {{\sin\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\pi}{2c} \right)}.}}}} & (30) \\{\mspace{79mu}{Since}} & \; \\\begin{matrix}{\mspace{79mu}{{X\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack} = {{X_{c}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack} - {X_{s}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack}}}} \\{= {\left( {{X_{c\; 0}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack} - {X_{c\; 1}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack}} \right) -}} \\{\left( {{X_{s\; 0}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack} + {X_{s\; 1}\left\lbrack {{m \times k_{0}} + k_{1}} \right\rbrack}} \right),}\end{matrix} & \;\end{matrix}$we have:

$\begin{matrix}
{X_{c}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} T_{c}(n_{0}, k_{1}, k_{0}) \times \cos\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right),} & (31) \\
{\text{and}} & \; \\
{T_{c}(n_{0}, k_{1}, k_{0}) = A(n_{0}, k_{1}, k_{0}) \times \cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) - B(n_{0}, k_{1}, k_{0}) \times \sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right),} & (32) \\
{\text{while}} & \; \\
{X_{s}[m \times k_{0} + k_{1}] = \sum\limits_{n_{0} = 0}^{c - 1} T_{s}(n_{0}, k_{1}, k_{0}) \times \sin\left( \frac{(2n_{0} + 1)k_{0}\pi}{2c} \right),} & (33) \\
{\text{and}} & \; \\
{T_{s}(n_{0}, k_{1}, k_{0}) = A(n_{0}, k_{1}, k_{0}) \times \sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right) + B(n_{0}, k_{1}, k_{0}) \times \cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right).} & (34)
\end{matrix}$

Summarizing the derivations, the input signals pass through the DCT-III process of equation (21) and the DST-III process of equation (22), are then multiplied by the respective cosine and sine factors in equation (32) and equation (34), and finally pass through the DCT-II process of equation (31) and the DST-II process of equation (33). Thus, a faster DCT-IV operation is obtained.

Let n=0˜M−1, k=0˜M−1, M=m×c, n₀=0˜c−1, k₀=0˜c−1, n₁=0˜m−1, and k₁=0˜m−1. The complete M-point DCT-IV method can then be written as the following steps.

1. The input signals are arranged into a two-dimensional order permutation based on n=n₀+c×n₁.

2. The arranged data (the permutation) is input to an m-point DCT-III/DST-III hardware.

3. The result transformed by the m-point DCT-III/DST-III hardware is operated with the cosine and sine factors.

4. The result after the operation is input to a c-point DCT-II/DST-II hardware.

5. The results transformed by the c-point DCT-II and DST-II hardware are subtracted and permuted based on k=m×k₀+k₁.
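The five steps can be checked numerically against the direct DCT-IV definition. The following is a minimal software sketch, not the hardware of the invention. It assumes that A and B of equations (27)-(34) are the m-point DCT-III and DST-III of the permuted input with the sign (−1)^(n₁k₀) applied, which is the form obtained by substituting n=n₀+c×n₁ and k=m×k₀+k₁ into the DCT-IV kernel. The function names `dct4_direct` and `dct4_two_stage` are hypothetical.

```python
import math

def dct4_direct(x):
    """Reference M-point DCT-IV, evaluated straight from its definition."""
    M = len(x)
    return [sum(x[n] * math.cos((2 * n + 1) * (2 * k + 1) * math.pi / (4 * M))
                for n in range(M)) for k in range(M)]

def dct4_two_stage(x, m, c):
    """M-point DCT-IV (M = m*c) following steps 1-5; illustrative only."""
    M = m * c
    assert len(x) == M
    # Step 1: two-dimensional permutation, n = n0 + c*n1.
    grid = [[x[n0 + c * n1] for n1 in range(m)] for n0 in range(c)]
    # Step 2: m-point DCT-III (A) and DST-III (B) for every n0.
    # Assumption: the third index is the parity of k0 and selects the sign
    # (-1)^(n1*k0) on the input samples.
    A, B = {}, {}
    for n0 in range(c):
        for k1 in range(m):
            for par in (0, 1):
                a = b = 0.0
                for n1 in range(m):
                    v = -grid[n0][n1] if (par and n1 % 2) else grid[n0][n1]
                    g = (2 * k1 + 1) * n1 * math.pi / (2 * m)
                    a += v * math.cos(g)
                    b += v * math.sin(g)
                A[n0, k1, par] = a
                B[n0, k1, par] = b
    X = [0.0] * M
    for k0 in range(c):
        for k1 in range(m):
            xc = xs = 0.0
            for n0 in range(c):
                a, b = A[n0, k1, k0 % 2], B[n0, k1, k0 % 2]
                # Step 3: cosine and sine factors, equations (32) and (34).
                alpha = (2 * n0 + 1) * (2 * k1 + 1) * math.pi / (4 * M)
                tc = a * math.cos(alpha) - b * math.sin(alpha)
                ts = a * math.sin(alpha) + b * math.cos(alpha)
                # Step 4: c-point DCT-II and DST-II, equations (31) and (33).
                beta = (2 * n0 + 1) * k0 * math.pi / (2 * c)
                xc += tc * math.cos(beta)
                xs += ts * math.sin(beta)
            # Step 5: subtraction and output permutation, k = m*k0 + k1.
            X[m * k0 + k1] = xc - xs
    return X
```

For a 12-point input, `dct4_two_stage(x, 3, 4)` and `dct4_two_stage(x, 4, 3)` should both agree with `dct4_direct(x)` up to floating-point rounding.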

Steps (1) and (5), which are regarded as pre- and post-processing, essentially perform the permutation, addition, and subtraction operations, and steps (2)-(4) are the operations of the kernel hardware architecture. The steps above are shown in FIG. 2, which is a schematic diagram of a DCT-IV (discrete cosine transform of type-IV) operation according to an embodiment of the present invention. FIG. 2 also shows that step (2) requires repeating the m-point DCT-III/DST-III operation c times. An implementation that relies only on parallel hardware would therefore need c DCT-III/DST-III circuits; it consumes more hardware resources and makes it difficult to adjust those resources for different numbers of points. Therefore, to support multiple audio standards with different numbers of points, the invention uses one DCT-III/DST-III hardware to repeat the operation c times, and for different numbers of points only the number of repetitions is adjusted. Similarly, step (4) requires repeating the c-point DCT-II/DST-II operation m times, which is implemented by repeating the operation m times on one DCT-II/DST-II hardware.

As cited above, the invention divides the M-point DCT-IV operation into an m-point DCT-III/DST-III operation and a c-point DCT-II/DST-II operation. FIG. 2 shows that the result of the m-point DCT-III/DST-III operation is operated with the cosine and sine factors and thereby becomes the input data for the c-point DCT-II/DST-II operation. Therefore, the complete architecture can be generally divided into three stages: the first stage is the m-point DCT-III/DST-III operation, the intermediate stage is the cosine and sine factor operation, and the second stage is the c-point DCT-II/DST-II operation. Thus, the invention can arrange the sequence of data operations so that each stage operates independently, which increases the performance in a pipelined form.

First, the input data sequence is arranged into a two-dimensional order permutation based on n=n₀+c×n₁, as shown in FIG. 3, which is a schematic diagram of input data mapping according to an embodiment of the present invention. Next, the computational order first sets k₁ and n₀ to their respective initial values, then increments n₀ by one after each m-step recursion, and repeats the increment until n₀=c−1, which indicates that the c m-point DCT-III/DST-III operations are complete. The c results can then be delivered to the next stage for performing the c-point DCT-II/DST-II operations, while k₁ is incremented by one and n₀ is reset to zero, as shown in FIG. 4, which is a schematic diagram of the relationship between the indexes n₀ and k₁ according to an embodiment of the present invention. As cited above, such an operation makes the data streams independent, so the pipeline can be kept fully occupied, as shown in FIG. 5, which is a schematic diagram of the relationship between the m-point DCT-III/DST-III and the c-point DCT-II/DST-II operations according to an embodiment of the present invention.
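The index order described above can be written out as a small generator. This is only an illustrative sketch of the loop nesting; the name `first_stage_schedule` is hypothetical, and the pairing of k₁ with m−1−k₁ that the hardware exploits is deliberately ignored here, so k₁ simply runs over all m values.

```python
def first_stage_schedule(m, c):
    """Yield (k1, n0, n1) in the order in which the first stage consumes data.

    For a fixed k1, n0 advances after every m recursion steps (one m-point
    transform); after c transforms the c results go to the second stage,
    k1 advances by one and n0 restarts from zero.
    """
    for k1 in range(m):
        for n0 in range(c):
            for n1 in range(m):
                yield k1, n0, n1
```

With m=3 and c=2, for instance, the generator produces three recursion steps for n₀=0, three for n₀=1, and only then moves on to k₁=1.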

The duration required for a pipeline depends on which circuit stage requires the longest operating time. In practice, the operating speed of the first stage has to be smaller than or equal to that of the second stage, i.e., m≥c. In addition, when m=c, the pipeline achieves the optimal efficiency. FIG. 6 is a schematic diagram of a pipelined DCT-IV according to an embodiment of the present invention.

With the pipelined scheme, the number of cycles is improved by a factor of c as compared with the conventional method. However, a certain number of additional registers is required in exchange. Since the data transfer between the stages requires registers for storage, the number of registers required by the architecture is determined by c. Namely, the number of registers increases as the speed-up factor increases.

Based on FIG. 2, the DCT-III/DST-III hardware architecture is designed. The design focuses on providing a recursive DCT-III/DST-III hardware architecture with a low operational period, thereby improving on the slow recursive architectures of the prior art. In addition, a sharing scheme is adopted in the hardware design, which allows the designed hardware to provide both the DCT-III and DST-III operational capabilities and thereby reduces the hardware cost.

Equation (35) and Equation (36) are defined as the m-point DCT-III and DST-III math models respectively, for input signals y[n₁] and z[n₁] and output signals Y_(DCT-III)[k₁] and Z_(DST-III)[k₁], where n₁=0˜m−1 and k₁=0˜m−1:

$\begin{matrix}{{{Y_{{DCT}\text{-}{III}}\left\lbrack k_{1} \right\rbrack} = {\sum\limits_{n_{1} = 0}^{m - 1}\;{{y\left\lbrack n_{1} \right\rbrack} \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},} & (35) \\{{Z_{{DST}\text{-}{III}}\left\lbrack k_{1} \right\rbrack} = {\sum\limits_{n_{1} = 0}^{m - 1}\;{{z\left\lbrack n_{1} \right\rbrack} \times {{\sin\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}.}}}} & (36)\end{matrix}$

For different applications, the number of points m can be odd or even; the two cases are discussed separately as follows.

(A) m is an Odd Number

If m is an odd number, Equation (35) is rewritten as Equation (37), Equation (38), and Equation (39):

$\begin{matrix}{\mspace{79mu}{{{Y_{{DCT}\text{-}{III}}\left\lbrack k_{1} \right\rbrack} = {\sum\limits_{n_{1} = 0}^{m - 1}\;{{y\left\lbrack n_{1} \right\rbrack} \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},\mspace{79mu}{{{for}\mspace{14mu} k_{1}} = {{\left. 0 \right.\sim\left( {m - 3} \right)}/2}},}} & (37) \\{{{Y_{{DCT}\text{-}{III}}\left\lbrack {m - 1 - k_{1}} \right\rbrack} = {\sum\limits_{n_{1} = 0}^{m - 1}\;{\left( {- 1} \right)^{n_{1}} \times {y\left\lbrack n_{1} \right\rbrack} \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},\mspace{79mu}{{{for}\mspace{14mu} k_{1}} = {{\left. 0 \right.\sim\left( {m - 3} \right)}/2}},} & (38) \\{{{Y_{{DCT}\text{-}{III}}\left\lbrack {\left( {m - 1} \right)/2} \right\rbrack} = {{y\lbrack 0\rbrack} - {y\lbrack 2\rbrack} + {y\lbrack 4\rbrack} - {y\lbrack 6\rbrack} +}},\ldots\mspace{14mu},{- {{y\left\lbrack {m - 1} \right\rbrack}.}}} & (39)\end{matrix}$

From equation (37), equation (38), and equation (39), it is known that the data throughput per transformation (DTPT) in equation (35) is doubled, so that only m×(m−1)/2 cycles are required for completing the m-point DCT-III computation. However, the operation in equation (39) requires additional adders and registers, as shown in FIG. 7, which is a schematic diagram of using the additional adder and registers according to an embodiment of the present invention. When the operation in equation (39) is instead implemented with common hardware, as shown in FIG. 9, which is a schematic diagram of the DCT-III/DST-III hardware architecture according to an embodiment of the present invention, the computational period required for DCT-III is m×(m+1)/2 cycles.
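The doubling of the throughput can be illustrated in software: each cosine coefficient is computed once and contributes to both Y[k₁] and Y[m−1−k₁]. The following minimal sketch covers the odd-m case of equations (37)-(39) only; the function name `dct3_paired_odd` is hypothetical and the code is not a model of the circuit of FIG. 9.

```python
import math

def dct3_paired_odd(y):
    """m-point DCT-III for odd m, producing two outputs per coefficient pass."""
    m = len(y)
    assert m % 2 == 1
    Y = [0.0] * m
    for k1 in range((m - 1) // 2):          # k1 = 0 .. (m-3)/2
        fwd = bwd = 0.0
        for n1 in range(m):
            term = y[n1] * math.cos((2 * k1 + 1) * n1 * math.pi / (2 * m))
            fwd += term                      # equation (37)
            bwd += -term if n1 % 2 else term # equation (38): sign (-1)^n1
        Y[k1] = fwd
        Y[m - 1 - k1] = bwd
    # Equation (39): the centre output needs additions only, because
    # cos(n1*pi/2) is 0 for odd n1 and alternates +1, -1 for even n1.
    Y[(m - 1) // 2] = sum((-1) ** (n1 // 2) * y[n1] for n1 in range(0, m, 2))
    return Y
```

The inner loop runs (m−1)/2 times with m steps each, which matches the m×(m−1)/2 figure quoted above when the centre output is handled by separate adders.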

Similarly, equation (36) can be rewritten as equation (40), equation (41), and equation (42).

$\begin{matrix}{Z_{DST\text{-}III}\left[ k_{1} \right] = \sum\limits_{n_{1} = 0}^{m - 1} z\left[ n_{1} \right] \times \sin\left( \frac{\left( 2k_{1} + 1 \right)n_{1}\pi}{2m} \right),\quad \text{for } k_{1} = 0 \sim \left( m - 3 \right)/2,} & (40) \\ {Z_{DST\text{-}III}\left[ m - 1 - k_{1} \right] = \sum\limits_{n_{1} = 0}^{m - 1} \left( - 1 \right)^{n_{1} + 1} \times z\left[ n_{1} \right] \times \sin\left( \frac{\left( 2k_{1} + 1 \right)n_{1}\pi}{2m} \right),\quad \text{for } k_{1} = 0 \sim \left( m - 3 \right)/2,} & (41) \\ {Z_{DST\text{-}III}\left[ \left( m - 1 \right)/2 \right] = z\left[ 1 \right] - z\left[ 3 \right] + z\left[ 5 \right] - z\left[ 7 \right] + \cdots + z\left[ m - 2 \right].} & (42)\end{matrix}$

From equation (40), equation (41), and equation (42), it is known that the DTPT in equation (36) is doubled, so only m×(m−1)/2 cycles are required for completing the m-point DST-III computation. However, the operation in equation (42) requires additional adders and registers, as shown in FIG. 8, which is a schematic diagram of using the additional adder and registers according to an embodiment of the present invention. When the operation in equation (42) is instead implemented with common hardware, as shown in FIG. 9, the computational period required for DST-III is m×(m+1)/2 cycles.

Next, to allow DST-III and DCT-III to share the cosine coefficients, Equation (43) is derived from Equation (40), and Equation (44) is derived from Equation (41).

$\begin{matrix}{Z_{DST\text{-}III}\left[ k_{1} \right] = \left( - 1 \right)^{k_{1}} \times \sum\limits_{n_{1} = 0}^{m - 1} z\left[ m - n_{1} \right] \times \cos\left( \frac{\left( 2k_{1} + 1 \right)n_{1}\pi}{2m} \right),\quad \text{for } k_{1} = 0 \sim \left( m - 3 \right)/2.} & (43) \\ {Z_{DST\text{-}III}\left[ m - 1 - k_{1} \right] = \left( - 1 \right)^{k_{1}} \times \sum\limits_{n_{1} = 0}^{m - 1} \left( - 1 \right)^{m - n_{1} + 1} \times z\left[ m - n_{1} \right] \times \cos\left( \frac{\left( 2k_{1} + 1 \right)n_{1}\pi}{2m} \right),\quad \text{for } k_{1} = 0 \sim \left( m - 3 \right)/2.} & (44)\end{matrix}$

From Equation (43) and Equation (44), it is known that the cosine coefficients for DCT-III can be shared by simply reordering the input signals for DST-III and adjusting the positive and negative signs, thereby producing the operational result for DST-III and saving hardware cost.
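The coefficient sharing can be checked with a short sketch that evaluates the DST-III using cosine coefficients only. It assumes that the out-of-range sample z[m] addressed at n₁=0 is taken as zero, which is the convention under which equations (43) and (44) reproduce equation (36); the function name `dst3_via_cosine` is hypothetical.

```python
import math

def dst3_via_cosine(z):
    """m-point DST-III computed from reversed inputs and DCT-III cosines."""
    m = len(z)
    zp = list(z) + [0.0]                 # assumption: z[m] = 0
    Z = [0.0] * m
    for k1 in range((m + 1) // 2):
        sign_k = -1.0 if k1 % 2 else 1.0 # (-1)^k1
        fwd = bwd = 0.0
        for n1 in range(m):
            coef = math.cos((2 * k1 + 1) * n1 * math.pi / (2 * m))
            v = zp[m - n1]               # reordered input z[m - n1]
            fwd += v * coef                               # equation (43)
            bwd += (-1.0) ** (m - n1 + 1) * v * coef      # equation (44)
        Z[k1] = sign_k * fwd
        Z[m - 1 - k1] = sign_k * bwd
    return Z
```

For odd m the loop also visits the centre index k₁=(m−1)/2, where both sums give the same value, so no separate treatment of equation (42) is needed in this software check.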

(B) m is an Even Number:

If m is an even number, equation (35) can be rewritten as equation (45) and equation (46):

$\begin{matrix}{\mspace{79mu}{{{Y_{{DCT}\text{-}{III}}\left\lbrack k_{1} \right\rbrack} = {\sum\limits_{n_{1} = 0}^{m - 1}\;{{y\left\lbrack n_{1} \right\rbrack} \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},\mspace{79mu}{{{for}\mspace{14mu} k_{1}} = {{\left. 0 \right.\sim\left( {m - 1} \right)}/2.}}}} & (45) \\{{{Y_{{DCT}\text{-}{III}}\left\lbrack {m - 1 - k_{1}} \right\rbrack} = {\sum\limits_{n_{1} = 0}^{m - 1}\;{\left( {- 1} \right)^{n_{1}} \times {y\left\lbrack n_{1} \right\rbrack} \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},\mspace{79mu}{{{for}\mspace{14mu} k_{1}} = {{\left. 0 \right.\sim\left( {m - 1} \right)}/2.}}} & (46)\end{matrix}$

From equation (45) and equation (46), it is known that the DTPT in equation (35) is doubled, so only m²/2 cycles are required for completing the m-point DCT-III computation.

Similarly, equation (36) can be rewritten as equation (47) and equation (48):

$\begin{matrix}{{{Z_{{DST}\text{-}{III}}\left\lbrack k_{1} \right\rbrack} = {\sum\limits_{n_{1} = 0}^{m - 1}\;{{z\left\lbrack n_{1} \right\rbrack} \times {\sin\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},{{{for}\mspace{14mu} k_{1}} = {{\left. 0 \right.\sim\left( {m - 1} \right)}/2.}}} & (47) \\{{{Z_{{DST}\text{-}{III}}\left\lbrack {m - 1 - k_{1}} \right\rbrack} = {\sum\limits_{n_{1} = 0}^{m - 1}\;{\left( {- 1} \right)^{n_{1} + 1} \times {z\left\lbrack n_{1} \right\rbrack} \times {\sin\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)}}}},\mspace{79mu}{{{for}\mspace{14mu} k_{1}} = {{\left. 0 \right.\sim\left( {m - 1} \right)}/2.}}} & (48)\end{matrix}$

Next, to allow DST-III and DCT-III to share the cosine coefficients, equation (49) is derived from equation (47), and equation (50) is derived from equation (48):

$\begin{matrix}{Z_{DST\text{-}III}\left[ k_{1} \right] = \left( - 1 \right)^{k_{1}} \times \sum\limits_{n_{1} = 0}^{m - 1} z\left[ m - n_{1} \right] \times \cos\left( \frac{\left( 2k_{1} + 1 \right)n_{1}\pi}{2m} \right),\quad \text{for } k_{1} = 0 \sim \left( m - 1 \right)/2.} & (49) \\ {Z_{DST\text{-}III}\left[ m - 1 - k_{1} \right] = \left( - 1 \right)^{k_{1}} \times \sum\limits_{n_{1} = 0}^{m - 1} \left( - 1 \right)^{m - n_{1} + 1} \times z\left[ m - n_{1} \right] \times \cos\left( \frac{\left( 2k_{1} + 1 \right)n_{1}\pi}{2m} \right),\quad \text{for } k_{1} = 0 \sim \left( m - 1 \right)/2.} & (50)\end{matrix}$

From equation (49) and equation (50), it is known that the cosine coefficients for DCT-III can be shared by simply reordering the input signals for DST-III and adjusting the positive and negative signs, thereby producing the operational result for DST-III and saving hardware cost.

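Finally, the Chebyshev polynomial recurrences are given by equation (51) and equation (52):

$\begin{matrix}{\cos\left( n_{1}\theta_{k_{1}} \right) = 2\cos\left( \theta_{k_{1}} \right)\cos\left( \left( n_{1} - 1 \right)\theta_{k_{1}} \right) - \cos\left( \left( n_{1} - 2 \right)\theta_{k_{1}} \right),} & (51) \\ {\sin\left( n_{1}\theta_{k_{1}} \right) = 2\cos\left( \theta_{k_{1}} \right)\sin\left( \left( n_{1} - 1 \right)\theta_{k_{1}} \right) - \sin\left( \left( n_{1} - 2 \right)\theta_{k_{1}} \right),} & (52)\end{matrix}$

where $\theta_{k_{1}} = {\frac{\left( {{2k_{1}} + 1} \right)\pi}{2m}.}$ Expanding the Chebyshev polynomials, we have: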

$\begin{matrix}{{\cos\left( \frac{\left( {{2k_{1}} + 1} \right)n_{1}\pi}{2m} \right)} = {{2 \times {\cos\left( \frac{\left( {{2k_{1}} + 1} \right)\pi}{2m} \right)}{\cos\left( \frac{\left( {{2k_{1}} + 1} \right)\left( {n_{1} - 1} \right)\pi}{2m} \right)}} - {{\cos\left( \frac{\left( {{2k_{1}} + 1} \right)\left( {n_{1} - 2} \right)\pi}{2m} \right)}.}}} & (53)\end{matrix}$

The initial values cos((2k₁+1)π/2m), 1, and cos((2k₁+1)(−1)π/2m) of the three cosine functions on the right side of equation (53) are obtained by plugging in n₁=1.

Since cos((2k₁+1)(−1)π/2m)=cos((2k₁+1)π/2m), the cosine coefficients with the same k₁ and different n₁ can be generated by the recursive operation in equation (53), accessing only cos((2k₁+1)π/2m), so the memory requirement is only m words.
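The recursion of equation (53) can be sketched as follows. Only one stored value per k₁, cos((2k₁+1)π/2m), is read; every other coefficient follows from the two previous ones. The function name `cosine_sequence` is hypothetical, and floating-point arithmetic stands in for whatever fixed-point format the hardware would use.

```python
import math

def cosine_sequence(k1, m):
    """Return cos((2*k1+1)*n1*pi/(2*m)) for n1 = 0 .. m-1 via equation (53)."""
    c1 = math.cos((2 * k1 + 1) * math.pi / (2 * m))  # the single stored word
    prev2 = c1       # n1 - 2 = -1: cos(-theta) = cos(theta)
    prev1 = 1.0      # n1 - 1 = 0: cos(0) = 1
    out = [prev1]
    for _ in range(1, m):
        cur = 2.0 * c1 * prev1 - prev2
        out.append(cur)
        prev2, prev1 = prev1, cur
    return out
```

The first recursion step gives 2·c1·1 − c1 = c1, which is the n₁=1 coefficient, in agreement with the initial values stated above.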

Next, plugging equation (53) into equation (37), equation (38), equation (43), equation (44), equation (45), and equation (46), the method with a low operation cycle count and both DCT-III and DST-III operational capabilities is obtained. In addition, the operations in equation (39) and equation (42) require additions only, which can be implemented by adders shared with the other operations. In this case, the hardware architecture is designed as shown in FIG. 9.

In FIG. 9, y[n₁] indicates the input signal Input1, and z[n₁] indicates the input signal Input2.

When m is an odd number, the output signal Output1 corresponds to the result of equation (37) or equation (39), the output signal Output2 corresponds to the result of equation (38), the output signal Output3 corresponds to the result of equation (44), and the output signal Output4 corresponds to the result of equation (43) or equation (42).

When m is an even number, the output signal Output1 corresponds to the result of equation (45), the output signal Output2 corresponds to the result of equation (46), the output signal Output3 corresponds to the result of equation (48), and the output signal Output4 corresponds to the result of equation (47). The parameters corresponding to the output signals in FIG. 9 are shown in Table 5.3.1 as follows.

TABLE 5.3.1 DCT-III/DST-III hardware and method correspondence table

| Hardware I/O | Corresponding Parameters |
|---|---|
| Input1 | x(n₀, n₁) |
| Input2 | x(n₀, m − n₁) |
| Output1 | A(n₀, k₁, 0), A(n₀, m − 1 − k₁, 1) |
| Output2 | A(n₀, m − 1 − k₁, 0), A(n₀, k₁, 1) |
| Output3 | B(n₀, m − 1 − k₁, 0), −B(n₀, k₁, 1) |
| Output4 | B(n₀, k₁, 0), −B(n₀, m − 1 − k₁, 1) |

P.S.: n₀ = 0~c − 1, n₁ = 0~m − 1, k₁ = 0~(m + 1)/2 when m is an odd number, and k₁ = 0~m/2 when m is an even number.

As cited above, the recursive type-III discrete cosine/sine transform device 120 is implemented by sharing the hardware, and the computational period includes m×(m+1)/2 cycles.

As shown in FIG. 9, the recursive type-III discrete cosine/sine transform device 120 includes first to sixth registers 901-906, first to fifth adders 931-935, a first 3-to-1 multiplexer 951, a second 3-to-1 multiplexer 952, a first multiplier 971, a second multiplier 972, and a fourth multiplier 974.

Based on FIG. 2, the DCT-II/DST-II hardware architecture is designed. The design focuses on providing a recursive DCT-II/DST-II hardware architecture with a low operational period, thereby improving on the slow recursive architectures of the prior art. In addition, a sharing scheme is adopted in the hardware design, which allows the designed hardware to provide both the DCT-II and DST-II operational capabilities and thereby reduces the hardware cost.

Equation (54) and Equation (55) are defined as the c-point DCT-II and DST-II math models respectively, for input signals p[n₀] and q[n₀] and output signals P_(DCT-II)[k₀] and Q_(DST-II)[k₀], where n₀=0˜c−1 and k₀=0˜c−1:

$\begin{matrix}{P_{DCT\text{-}II}\left[ k_{0} \right] = \sum\limits_{n_{0} = 0}^{c - 1} p\left[ n_{0} \right] \times \cos\left( \frac{\left( 2n_{0} + 1 \right)k_{0}\pi}{2c} \right).} & (54) \\ {Q_{DST\text{-}II}\left[ k_{0} \right] = \sum\limits_{n_{0} = 0}^{c - 1} q\left[ n_{0} \right] \times \sin\left( \frac{\left( 2n_{0} + 1 \right)k_{0}\pi}{2c} \right).} & (55)\end{matrix}$

For different applications, the number of points c can be odd or even; the two cases are discussed separately as follows.

(A) c is an Odd Number

If c is an odd number, Equation (54) can be rewritten as Equation (56) and Equation (57):

$\begin{matrix}{P_{DCT\text{-}II}\left[ 0 \right] = p^{(1)}\left[ 0 \right] + p^{(1)}\left[ 1 \right] + \cdots + p^{(1)}\left[ \left( c - 1 \right)/2 - 1 \right] + p\left[ \frac{c - 1}{2} \right],} & (56) \\ {P_{DCT\text{-}II}\left[ k_{0} \right] = \sum\limits_{n_{0} = 0}^{{(c - 1)}/2 - 1} p^{(1)}\left[ n_{0} \right] \times \cos\left( \frac{\left( 2n_{0} + 1 \right)k_{0}\pi}{2c} \right) + \mathrm{tmp0} \times p\left[ \frac{c - 1}{2} \right],} & (57) \\ {\text{where } p^{(1)}\left[ n_{0} \right] = p\left[ n_{0} \right] + \left( - 1 \right)^{k_{0}}p\left[ c - 1 - n_{0} \right],\text{ and } \mathrm{tmp0} = \left\{ \begin{matrix}{1,} & {\mathrm{mod}\left( k_{0},4 \right) = 0} \\ {- 1,} & {\mathrm{mod}\left( k_{0},4 \right) = 2} \\ {0,} & {\mathrm{mod}\left( k_{0},4 \right) = 1\text{ or }3.}\end{matrix} \right.} & (58)\end{matrix}$

From equation (56) and equation (57), it is known that the input data p[n₀] in equation (54) is operated with equation (58) to produce p⁽¹⁾[n₀], which has half the data amount of the original, so the computational period required for DCT-II includes only (c−1)/2×c cycles. However, the operation in equation (56) requires additional adders and registers, as shown in FIG. 10, which is a schematic diagram of the hardware architecture corresponding to equation (56) according to an embodiment of the present invention. When the operation in equation (56) is instead implemented with common hardware, as shown in FIG. 12, which is a schematic diagram of the DCT-II/DST-II hardware architecture according to an embodiment of the present invention, the computational period required for DCT-II is (c+1)/2×c cycles.

Similarly, equation (55) can be rewritten as equation (59), equation (60), and equation (61) as follows:

$\begin{matrix}{\mspace{79mu}{{{Q_{{DST}\text{-}{II}}\lbrack 0\rbrack} = 0},}} & (59) \\{{{Q_{{DST}\text{-}{II}}\left\lbrack k_{0} \right\rbrack} = {{\sum\limits_{n_{0} = 0}^{{{({c - 1})}/2} - 1}\;{{q^{(1)}\left\lbrack n_{0} \right\rbrack} \times {\sin\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\;\pi}{2c} \right)}}} + {{tmp}\; 1 \times {q\left\lbrack \frac{c - 1}{2} \right\rbrack}}}},} & (60) \\{\mspace{79mu}{{{{where}\mspace{14mu}{q^{(1)}\left\lbrack n_{0} \right\rbrack}} = {{q\left\lbrack n_{0} \right\rbrack} + {\left( {- 1} \right)^{k_{0} + 1}{q\left\lbrack {c - 1 - n_{0}} \right\rbrack}}}},\mspace{79mu}{{{and}\mspace{14mu}{tmp}\; 1} = \left\{ \begin{matrix}{1,} & {{{mod}\left( {k_{0},4} \right)} = 1} \\{{- 1},} & {{{mod}\left( {k_{0},4} \right)} = 3} \\{0,} & {{{mod}\left( {k_{0},4} \right)} = {0\mspace{14mu}{or}\mspace{14mu} 2.}}\end{matrix} \right.}}} & (61)\end{matrix}$

From equation (59) and equation (60), it is known that the input data q[n₀] in equation (55) is operated with equation (61) to produce q⁽¹⁾[n₀], which has half the data amount of the original, so the computational period required for DST-II includes only (c−1)/2×c cycles.
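The folding of equations (56)-(61) can be illustrated for odd c with a minimal sketch. The sums run over only (c−1)/2 folded samples, and the centre sample is added through the factors tmp0 and tmp1. The function name `dct2_dst2_folded_odd` is hypothetical; the code checks the arithmetic and does not model the circuit of FIG. 12.

```python
import math

def dct2_dst2_folded_odd(p, q):
    """c-point DCT-II of p and DST-II of q for odd c, using folded inputs."""
    c = len(p)
    assert c % 2 == 1 and len(q) == c
    half = (c - 1) // 2
    P = [0.0] * c
    Q = [0.0] * c
    for k0 in range(c):
        s = -1.0 if k0 % 2 else 1.0             # (-1)^k0
        tmp0 = (1.0, 0.0, -1.0, 0.0)[k0 % 4]    # equation (58)
        tmp1 = (0.0, 1.0, 0.0, -1.0)[k0 % 4]    # equation (61)
        acc_p = tmp0 * p[half]
        acc_q = tmp1 * q[half]
        for n0 in range(half):
            ang = (2 * n0 + 1) * k0 * math.pi / (2 * c)
            p1 = p[n0] + s * p[c - 1 - n0]      # p^(1)[n0]
            q1 = q[n0] - s * q[c - 1 - n0]      # q^(1)[n0], sign (-1)^(k0+1)
            acc_p += p1 * math.cos(ang)         # equation (57)
            acc_q += q1 * math.sin(ang)         # equation (60)
        P[k0] = acc_p
        Q[k0] = acc_q
    return P, Q
```

For k₀=0 the cosine terms are all one and the sine terms all zero, so the loop reduces to equation (56) for P[0] and to equation (59), Q[0]=0.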

Next, the cosine function is derived from the sine function in the DST-II method, so the cosine coefficients in the DCT-II method can be shared in the hardware implementation as follows.

$\begin{matrix}{{{Q_{{DST}\text{-}{II}}\left\lbrack {c - k_{0}} \right\rbrack} = {{\sum\limits_{n_{0} = 0}^{{{({c - 1})}/2} - 1}\;{\left( {- 1} \right)^{n_{0}} \times {q^{(2)}\left\lbrack n_{0} \right\rbrack} \times {\cos\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\;\pi}{2c} \right)}}} + {{tmp}\; 1 \times {q\left\lbrack \frac{c - 1}{2} \right\rbrack}}}},} & (62) \\{\mspace{79mu}{{{where}\mspace{14mu}{q^{(2)}\left\lbrack n_{0} \right\rbrack}} = {{q\left\lbrack n_{0} \right\rbrack} + {\left( {- 1} \right)^{c - k_{0} + 1}{{q\left\lbrack {c - 1 - n_{0}} \right\rbrack}.}}}}} & (63)\end{matrix}$

From equation (62) and equation (63), it is known that the cosine coefficients for DCT-II can be shared by simply reordering the output signals and adjusting the positive and negative signs of the input signals for DST-II, thereby producing the operational result for DST-II and saving hardware cost.
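A short sketch can confirm that equations (62) and (63) reproduce the DST-II of equation (55) for odd c while reading only DCT-II cosine coefficients. One reading is assumed here and should be treated as such: the factor tmp1 in equation (62) is evaluated at the output index c−k₀ rather than at k₀. The function name `dst2_via_cosine_odd` is hypothetical.

```python
import math

def dst2_via_cosine_odd(q):
    """c-point DST-II for odd c using cosine coefficients, equations (62)-(63)."""
    c = len(q)
    assert c % 2 == 1
    half = (c - 1) // 2
    Q = [0.0] * c                      # Q[0] = 0 by equation (59)
    for k0 in range(1, c):
        out = c - k0                   # outputs appear in reversed order
        s = (-1.0) ** (c - k0 + 1)     # sign in equation (63)
        # Assumption: tmp1 of equation (61) taken at the output index c - k0.
        tmp1 = (0.0, 1.0, 0.0, -1.0)[out % 4]
        acc = tmp1 * q[half]
        for n0 in range(half):
            coef = math.cos((2 * n0 + 1) * k0 * math.pi / (2 * c))
            q2 = q[n0] + s * q[c - 1 - n0]          # q^(2)[n0]
            acc += (-1.0) ** n0 * q2 * coef
        Q[out] = acc
    return Q
```

The cosine arguments are exactly those of equation (57), so the same coefficient generator can feed both the DCT-II and the DST-II accumulations.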

(B) c is an Even Number

If c is an even number, equation (54) can be rewritten as equation (64) and equation (65) as follows.

$\begin{matrix}{{{P_{{DCT}\text{-}{II}}\lbrack 0\rbrack} = {{p^{(1)}\lbrack 0\rbrack} + {p^{(1)}\lbrack 1\rbrack} +}},\ldots\mspace{14mu},{+ {p^{(1)}\left\lbrack {{c/2} - 1} \right\rbrack}},} & (64) \\{{P_{{DCT}\text{-}{II}}\left\lbrack k_{0} \right\rbrack} = {\sum\limits_{n_{0} = 0}^{{c/2} - 1}{{p^{(1)}\left\lbrack n_{0} \right\rbrack} \times {{\cos\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\pi}{2c} \right)}.}}}} & (65)\end{matrix}$

From Equation (64) and Equation (65), it is known that the input data p[n₀] in Equation (54) is operated with Equation (58) to produce p⁽¹⁾[n₀], which has half the data amount of the original, so the computational period required for DCT-II includes only c/2×(c−1) cycles. However, the operation in Equation (64) requires additional adders and registers, as shown in FIG. 11, which is a schematic diagram of using additional adder and registers according to an embodiment of the present invention. When the operation in Equation (64) is instead implemented with common hardware, as shown in FIG. 12, the computational period required for DCT-II is c²/2 cycles.

Similarly, Equation (55) can be rewritten as Equation (66) and Equation (67) as follows:

$\begin{matrix}{{Q_{{DST}\text{-}{II}}\lbrack 0\rbrack} = 0.} & (66) \\{{Q_{{DST}\text{-}{II}}\left\lbrack k_{0} \right\rbrack} = {\sum\limits_{n_{0} = 0}^{{c/2} - 1}{{q^{(1)}\left\lbrack n_{0} \right\rbrack} \times {{\sin\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\pi}{2c} \right)}.}}}} & (67)\end{matrix}$

Next, the cosine function can be derived from the sine function in the DST-II method, so the cosine coefficients in the DCT-II method can be shared in the hardware implementation as follows:

$\begin{matrix}{{Q_{{DST}\text{-}{II}}\left\lbrack {c - k_{0}} \right\rbrack} = {\sum\limits_{n_{0} = 0}^{{c/2} - 1}{\left( {- 1} \right)^{n_{0}} \times {q^{(2)}\left\lbrack n_{0} \right\rbrack} \times {{\cos\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\pi}{2c} \right)}.}}}} & (68)\end{matrix}$

From equation (68), it is known that the cosine coefficients for DCT-II can be shared by simply reordering the output signals and adjusting the positive and negative signs of the input signals for DST-II, thereby producing the operational result for DST-II and saving hardware cost.

Finally, based on the Chebyshev polynomials, the following equation can be obtained:

$\begin{matrix}{{\cos\left( \frac{\left( {{2n_{0}} + 1} \right)k_{0}\pi}{2c} \right)} = {{2 \times {\cos\left( \frac{k_{0}\pi}{c} \right)}{\cos\left( \frac{\left( {{2n_{0}} - 1} \right)k_{0}\pi}{2c} \right)}} - {{\cos\left( \frac{\left( {{2n_{0}} - 3} \right)k_{0}\pi}{2c} \right)}.}}} & (69)\end{matrix}$

The initial values cos(k₀π/c), cos(k₀π/2c), and cos(k₀(−1)π/2c) of the three cosine functions on the right side of Equation (69) are obtained by plugging in n₀=1. Since cos(k₀(−1)π/2c)=cos(k₀π/2c), the cosine coefficients with the same k₀ and different n₀ can be generated by the recursive operation in Equation (69), accessing only cos(k₀π/c) and cos(k₀π/2c), so the memory requirement is only 2c words.

Plugging equation (69) into equation (57), equation (62), equation (65), and equation (68), the method with a low operation cycle count and both DCT-II and DST-II operational capabilities is obtained. In this case, the derived method requires a hardware architecture designed as shown in FIG. 12.
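The recursion of Equation (69) can be sketched in the same way as for the first stage. Two stored values per k₀ are read, cos(k₀π/c) and cos(k₀π/2c), and each further coefficient follows from the two preceding ones. The function name `dct2_cosine_sequence` is hypothetical, and floating point is used purely for illustration.

```python
import math

def dct2_cosine_sequence(k0, c):
    """Return cos((2*n0+1)*k0*pi/(2*c)) for n0 = 0 .. c-1 via equation (69)."""
    step = math.cos(k0 * math.pi / c)        # stored word: cos(k0*pi/c)
    base = math.cos(k0 * math.pi / (2 * c))  # stored word: cos(k0*pi/2c)
    prev2 = base    # argument (2*n0 - 3) at n0 = 1: cos(-k0*pi/2c)
    prev1 = base    # argument (2*n0 - 1) at n0 = 1: cos(k0*pi/2c)
    out = [prev1]   # n0 = 0 coefficient
    for _ in range(1, c):
        cur = 2.0 * step * prev1 - prev2
        out.append(cur)
        prev2, prev1 = prev1, cur
    return out
```

With c values of k₀ and two words each, the table holds 2c words, in line with the memory figure given above.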

In FIG. 12, p⁽¹⁾[n₀] indicates the input signals Input1 and Input2, and q⁽¹⁾[n₀] indicates the input signals Input3 and Input4. When c is an odd number, the output signals Output1 and Output2 correspond to the result of equation (57) or equation (56), and the output signals Output3 and Output4 correspond to the result of equation (62).

When c is an even number, the output signals Output1 and Output2 correspond to the result of equation (64) and equation (65), and the output signals Output3 and Output4 correspond to the result of equation (68). The parameters corresponding to the output signals in FIG. 12 are shown in Table 5.4.1 as follows.

TABLE 5.4.1 DCT-II/DST-II hardware and method correspondence table

| Hardware I/O | Corresponding Parameters |
|---|---|
| Input1 | Data after T_(c)(n₀, k₁, k₀) operates in equation (39) |
| Input2 | Data after T_(c)(n₀, m − k₁ − 1, k₀) operates in equation (39) |
| Input3 | Data after T_(s)(n₀, k₁, k₀) operates in equation (44) |
| Input4 | Data after T_(s)(n₀, m − k₁ − 1, k₀) operates in equation (44) |
| Output1 | X_(c)(k₀, k₁) |
| Output2 | X_(c)(k₀, m − 1 − k₁) |
| Output3 | X_(s)(c − k₀, k₁) |
| Output4 | X_(s)(c − k₀, m − 1 − k₁) |

P.S.: k₀ = 0~c − 1, k₁ = 0~(m + 1)/2 when m is an odd number, and k₁ = 0~m/2 when m is an even number.

As cited above and shown in FIG. 12, the recursive type-II discrete cosine/sine transform device 140 can be implemented in common hardware, and the computational period includes c×(c+1)/2 cycles. The recursive type-II discrete cosine/sine transform device 140 includes seventh to twelfth registers 1217-1222, sixth to tenth adders 1236-1240, a third 3-to-1 multiplexer 1253, a fourth 3-to-1 multiplexer 1254, a third multiplier 1273, a fifth multiplier 1275, a sixth multiplier 1276, a seventh multiplier 1277, and an eighth multiplier 1278.

As cited above, an M-point DCT-IV operation in the invention is divided into an m-point DCT-III/DST-III operation and a c-point DCT-II/DST-II operation. Namely, the input signals pass through the first stage of DCT-III/DST-III and then the second stage of DCT-II/DST-II. However, the intermediate stage of cosine and sine factor operations is required before the signals are input to the second stage. The hardware architectures for the first and the second stages have been described above, and the intermediate stage of cosine and sine factor operations and the corresponding hardware design are described in detail as follows.

FIG. 13 is a schematic diagram of the recursive type-IV discrete cosine transform system 1300 according to another embodiment of the present invention. In FIG. 13, the system 1300 includes a first permutation device 1310, a modified recursive type-III discrete cosine/sine transform device 1320, a recursive type-II discrete cosine/sine transform device 1330, and a second permutation device 1340. The system 1300 merges the intermediate cosine and sine factor operations into the modified recursive type-III discrete cosine/sine transform device 1320.

The first permutation device 1310 receives N digital input signals and performs a two-dimensional order permutation operation on the N digital signals for generating N two-dimensional first temporal signals, where N is a positive integer.

The modified recursive type-III discrete cosine/sine transform device 1320 is connected to the first permutation device 1310 and has first and second operational modes such that, in the first operational mode, a type-III discrete cosine/sine transform is repeated c times on the N first temporal signals for generating c second temporal signals each with m points, where N=m×c, and m and c are positive integers.

The recursive type-II discrete cosine/sine transform device 1330 is connected to the modified recursive type-III discrete cosine/sine transform device 1320 and has first and second operational modes such that, in the first operational mode, a third temporal signal is received and a type-II discrete cosine/sine transform is repeated m times on the third temporal signal for generating m fourth temporal signals each with c points.

The second permutation device 1340 is connected to the recursive type-II discrete cosine/sine transform device 1330 in order to receive the fourth temporal signals and perform a one-dimensional order permutation operation on the fourth temporal signals for generating N one-dimensional output signals, wherein the N one-dimensional output signals are obtained by performing a type-IV discrete cosine transform on the N digital input signals.

From equation (32) and equation (34), it is known that the result of the first stage of DCT-III and DST-III operations is multiplied by the cosine and sine factors defined as follows:

Cosine Factor:

$\begin{matrix}{{\cos\left( \frac{\left( {{2n_{0}} + 1} \right)\left( {{2k_{1}} + 1} \right)\pi}{4M} \right)}.} & (70)\end{matrix}$

Sine Factor:

$\begin{matrix}{{\sin\left( \frac{\left( {{2n_{0}} + 1} \right)\left( {{2k_{1}} + 1} \right)\pi}{4M} \right)}.} & (71)\end{matrix}$

From equation (70) and equation (71), it is seen that, with n₀=0˜c−1, k₁=0˜m−1, and M=m×c, the M-point DCT-IV requires M cosine factors and M sine factors, i.e., a memory capacity of 2M words is required for accessing the cosine and sine factors. To reduce the size of the memory, the cosine and sine factor generation device, i.e., a cosine and sine coefficient generator, is designed in the invention.

First, the first stage of the hardware architecture generates two DCT-III and two DST-III operational results every m cycles, as shown in Table 5.3.1. These results (data) are multiplied by the corresponding cosine and sine factors defined in equation (72), so the intermediate stage of operations can be completed when the four factors are generated concurrently.

$\begin{matrix}{\left\{ \begin{matrix}{\cos\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right)} \\{\sin\left( \frac{(2n_{0} + 1)(2k_{1} + 1)\pi}{4M} \right)} \\{\cos\left( \frac{(2n_{0} + 1)(2(m - k_{1} - 1) + 1)\pi}{4M} \right)} \\{\sin\left( \frac{(2n_{0} + 1)(2(m - k_{1} - 1) + 1)\pi}{4M} \right)}\end{matrix} \right.\quad\begin{matrix}{\text{for } n_{0} = 0 \sim c - 1,} \\{\text{if } m \text{ is even, for } k_{1} = 0 \sim m/2,} \\{\text{if } m \text{ is odd, for } k_{1} = 0 \sim (m + 1)/2.}\end{matrix}} & (72)\end{matrix}$

It is known from FIG. 2 that the first-stage output data is produced in the following order: the index k₁ is first set to 0 while the index n₀ runs from zero to c−1, then k₁ is set to 1 while n₀ again runs from zero to c−1, and so on, with k₁ held at a fixed value while n₀ runs from zero to c−1, until k₁=m−1, at which point the first-stage output data has been completely output. With such an architecture, DTPT=2, and the data corresponding to the indexes k₁ and m−1−k₁ can be generated concurrently, so the cosine and sine factor generation device in the invention can output the coefficients in an order that matches the output order of the first-stage output data.

For a clearer derivation, some parameters in the invention are defined as:

${\theta_{f} = \frac{\left( {{2k_{1}} + 1} \right)\pi}{4M}},{\theta_{b} = {\frac{\left( {{2\left( {m - k_{1} - 1} \right)} + 1} \right)\pi}{4M}.}}$

Using the angle-sum identities of the trigonometric functions:

$\quad\left\{ \begin{matrix}{{\cos\left( {\alpha + \beta} \right)} = {{\cos\;\alpha \times \cos\;\beta} - {\sin\;\alpha \times \sin\;\beta}}} \\{{{\sin\left( {\alpha + \beta} \right)} = {{\sin\;\alpha \times \cos\;\beta} + {\cos\;\alpha \times \sin\;\beta}}},}\end{matrix} \right.$

equation (72) can be rewritten as the following recursion:

$\begin{matrix}\left\{ \begin{matrix}{{\cos\left( {\left( {{2n_{0}} + 1} \right)\theta_{f}} \right)} = {{{\cos\left( {\left( {{2n_{0}} - 1} \right)\theta_{f}} \right)} \times {\cos\left( {2\theta_{f}} \right)}} - {{\sin\left( {\left( {{2n_{0}} - 1} \right)\theta_{f}} \right)} \times {\sin\left( {2\theta_{f}} \right)}}}} \\{{\sin\left( {\left( {{2n_{0}} + 1} \right)\theta_{f}} \right)} = {{{\sin\left( {\left( {{2n_{0}} - 1} \right)\theta_{f}} \right)} \times {\cos\left( {2\theta_{f}} \right)}} + {{\cos\left( {\left( {{2n_{0}} - 1} \right)\theta_{f}} \right)} \times {\sin\left( {2\theta_{f}} \right)}}}} \\{{\cos\left( {\left( {{2n_{0}} + 1} \right)\theta_{b}} \right)} = {{{\cos\left( {\left( {{2n_{0}} - 1} \right)\theta_{b}} \right)} \times {\cos\left( {2\theta_{b}} \right)}} - {{\sin\left( {\left( {{2n_{0}} - 1} \right)\theta_{b}} \right)} \times {\sin\left( {2\theta_{b}} \right)}}}} \\{{\sin\left( {\left( {{2n_{0}} + 1} \right)\theta_{b}} \right)} = {{{\sin\left( {\left( {{2n_{0}} - 1} \right)\theta_{b}} \right)} \times {\cos\left( {2\theta_{b}} \right)}} + {{\cos\left( {\left( {{2n_{0}} - 1} \right)\theta_{b}} \right)} \times {{\sin\left( {2\theta_{b}} \right)}.}}}}\end{matrix} \right. & (73)\end{matrix}$

From the recursion above, it is easy to see that the initial values cos(θ_(f)), sin(θ_(f)), cos(θ_(b)), sin(θ_(b)) and cos(2θ_(f)), sin(2θ_(f)), cos(2θ_(b)), sin(2θ_(b)) are required for completing the operation. The number of initial values influences the ROM size, i.e., the more initial values there are, the more words the ROM requires. To reduce the number of initial values, the following relations are derived:

$\begin{matrix}\left\{ \begin{matrix}{\cos(2\theta_{f}) = \cos(\theta_{f}) \times \cos(\theta_{f}) - \sin(\theta_{f}) \times \sin(\theta_{f})} \\{\sin(2\theta_{f}) = \sin(\theta_{f}) \times \cos(\theta_{f}) + \cos(\theta_{f}) \times \sin(\theta_{f})} \\{\cos(2\theta_{b}) = \cos(\theta_{b}) \times \cos(\theta_{b}) - \sin(\theta_{b}) \times \sin(\theta_{b})} \\{\sin(2\theta_{b}) = \sin(\theta_{b}) \times \cos(\theta_{b}) + \cos(\theta_{b}) \times \sin(\theta_{b}).}\end{matrix} \right. & (74)\end{matrix}$

where only the initial values cos(θ_(f)), sin(θ_(f)), cos(θ_(b)), sin(θ_(b)) are needed to generate the cosine and sine factors for a same k₁ and different n₀, since cos(2θ_(f)), sin(2θ_(f)), cos(2θ_(b)), sin(2θ_(b)) can be calculated from equation (74). Therefore, the recursive relations are:

$n_{0} = 0\text{:}\quad\left\{ \begin{matrix}{\cos(\theta_{f})} \\{\sin(\theta_{f})} \\{\cos(\theta_{b})} \\{\sin(\theta_{b})}\end{matrix} \right.$

$n_{0} = 1\text{:}\quad\left\{ \begin{matrix}{\cos(3\theta_{f}) = \cos(\theta_{f}) \times \cos(2\theta_{f}) - \sin(\theta_{f}) \times \sin(2\theta_{f})} \\{\sin(3\theta_{f}) = \sin(\theta_{f}) \times \cos(2\theta_{f}) + \cos(\theta_{f}) \times \sin(2\theta_{f})} \\{\cos(3\theta_{b}) = \cos(\theta_{b}) \times \cos(2\theta_{b}) - \sin(\theta_{b}) \times \sin(2\theta_{b})} \\{\sin(3\theta_{b}) = \sin(\theta_{b}) \times \cos(2\theta_{b}) + \cos(\theta_{b}) \times \sin(2\theta_{b})}\end{matrix} \right.$

$n_{0} = 2\text{:}\quad\left\{ \begin{matrix}{\cos(5\theta_{f}) = \cos(3\theta_{f}) \times \cos(2\theta_{f}) - \sin(3\theta_{f}) \times \sin(2\theta_{f})} \\{\sin(5\theta_{f}) = \sin(3\theta_{f}) \times \cos(2\theta_{f}) + \cos(3\theta_{f}) \times \sin(2\theta_{f})} \\{\cos(5\theta_{b}) = \cos(3\theta_{b}) \times \cos(2\theta_{b}) - \sin(3\theta_{b}) \times \sin(2\theta_{b})} \\{\sin(5\theta_{b}) = \sin(3\theta_{b}) \times \cos(2\theta_{b}) + \cos(3\theta_{b}) \times \sin(2\theta_{b})}\end{matrix} \right.$

$\vdots$

$n_{0} = c - 1\text{:}\quad\left\{ \begin{matrix}{\cos((2c - 1)\theta_{f}) = \cos((2c - 3)\theta_{f}) \times \cos(2\theta_{f}) - \sin((2c - 3)\theta_{f}) \times \sin(2\theta_{f})} \\{\sin((2c - 1)\theta_{f}) = \sin((2c - 3)\theta_{f}) \times \cos(2\theta_{f}) + \cos((2c - 3)\theta_{f}) \times \sin(2\theta_{f})} \\{\cos((2c - 1)\theta_{b}) = \cos((2c - 3)\theta_{b}) \times \cos(2\theta_{b}) - \sin((2c - 3)\theta_{b}) \times \sin(2\theta_{b})} \\{\sin((2c - 1)\theta_{b}) = \sin((2c - 3)\theta_{b}) \times \cos(2\theta_{b}) + \cos((2c - 3)\theta_{b}) \times \sin(2\theta_{b}).}\end{matrix} \right.$
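The recursive relations can be checked numerically with a short Python sketch. It is only a floating-point software model under stated assumptions: the name `factor_stream` and the example split m=4, c=8 are hypothetical, and the four initial values that would be read from ROM are simply computed with `math.cos` and `math.sin` here.

```python
import math

def factor_stream(m, c, k1):
    """Generate (cos, sin) factors for theta_f and theta_b, n0 = 0..c-1.

    Hypothetical software model of the recursion: only the four initial
    values cos/sin(theta_f) and cos/sin(theta_b) are taken as given.
    """
    M = m * c
    theta_f = (2 * k1 + 1) * math.pi / (4 * M)
    theta_b = (2 * (m - k1 - 1) + 1) * math.pi / (4 * M)

    # Initial values (these stand in for the ROM contents).
    cf, sf = math.cos(theta_f), math.sin(theta_f)
    cb, sb = math.cos(theta_b), math.sin(theta_b)

    # Equation (74): double-angle terms from the initial values.
    c2f, s2f = cf * cf - sf * sf, sf * cf + cf * sf
    c2b, s2b = cb * cb - sb * sb, sb * cb + cb * sb

    out = []
    for _ in range(c):
        out.append((cf, sf, cb, sb))
        # Equation (73): advance the angle by 2*theta each step.
        cf, sf = cf * c2f - sf * s2f, sf * c2f + cf * s2f
        cb, sb = cb * c2b - sb * s2b, sb * c2b + cb * s2b
    return out

factors = factor_stream(4, 8, 0)  # assumed example: m = 4, c = 8, k1 = 0
```

Each loop iteration uses eight multiplications and four additions or subtractions, which agrees with the multiplier and adder counts that the text attributes to the factor generation device.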

The hardware architecture can be implemented with reference to FIG. 14, which is a schematic diagram of a cosine/sine factor generation device according to the embodiment.

TABLE 5.5.1 Hardware estimation of the cosine and sine factor generation device

Multiplier | Adder | ROM
8 | 4 | 2m

Table 5.5.1 indicates the hardware estimation of the cosine and sine factor generation device. It is known from Table 5.5.1 that the ROM size can be reduced from 2M to 2m words, i.e., to 1/c of the original, which is a considerable improvement in the memory requirement, but the price is eight additional multipliers and four additional adders. To overcome this, the inventive architecture is further improved.

Cosine and sine factors' multiplication operation and data folding process:

From equation (32) and equation (34), the results of a DCT-III and DST-III operation are multiplied by the cosine and sine factors, and the products are then added to or subtracted from one another. Next, it is known from equation (58) and equation (63) that the data is folded to reduce the data amount by half and is input to the second stage of DCT-II/DST-II operations. The operations cited above constitute the intermediate stage of operations, which is generally divided into three steps as follows:

1. The input signals are multiplied by the cosine and sine factors respectively.

2. The signals multiplied by the cosine factor and by the sine factor are added to or subtracted from one another.

3. The results after the operation in step (2) are folded.

The data produced by the intermediate stage is stored in registers. Since the folding operation reduces the data amount by half, only [c/2] records of data need to be stored. In addition, the intermediate stage of operations updates the data in the registers, while the second stage of operations needs the intermediate values to be provided repeatedly c times from the registers, so the data in the registers cannot be updated continuously. In this case, the number of registers is doubled. Accordingly, c registers are required for the results of a folding operation. As to the hardware action of the folding operation, an example with even c is described as follows: c−1 data are generated in step 2, and the 0-th to (c/2−1)-th records of data are sequentially stored directly in the registers, as shown in FIG. 15, which is a schematic diagram of the hardware action of inputting the upper half data in the folding operation according to the embodiment. The (c/2)-th to (c−1)-th records of data are operated with the data in the registers, and the results are stored back to the registers, as shown in FIG. 16, which is a schematic diagram of a hardware action of inputting lower half data in the folding operation according to the embodiment.

Next, plugging equation (12) and equation (14) into equation (58) and equation (63), the following relation is obtained:

$\begin{matrix}\left\{ \begin{matrix}{T_{c}^{\prime}(n_{0},k_{1},0) = T_{c}(n_{0},k_{1},0) + T_{c}(c - n_{0} - 1,k_{1},0)} \\{T_{c}^{\prime}(n_{0},m - k_{1} - 1,0) = T_{c}(n_{0},m - k_{1} - 1,0) + T_{c}(c - n_{0} - 1,m - k_{1} - 1,0)} \\{T_{c}^{\prime}(n_{0},k_{1},1) = T_{c}(n_{0},k_{1},1) + T_{c}(c - n_{0} - 1,k_{1},1)} \\{T_{c}^{\prime}(n_{0},m - k_{1} - 1,1) = T_{c}(n_{0},m - k_{1} - 1,1) + T_{c}(c - n_{0} - 1,m - k_{1} - 1,1)} \\{T_{s}^{\prime}(n_{0},k_{1},0) = T_{s}(n_{0},k_{1},0) + T_{s}(c - n_{0} - 1,k_{1},0)} \\{T_{s}^{\prime}(n_{0},m - k_{1} - 1,0) = T_{s}(n_{0},m - k_{1} - 1,0) + T_{s}(c - n_{0} - 1,m - k_{1} - 1,0)} \\{T_{s}^{\prime}(n_{0},k_{1},1) = T_{s}(n_{0},k_{1},1) + T_{s}(c - n_{0} - 1,k_{1},1)} \\{T_{s}^{\prime}(n_{0},m - k_{1} - 1,1) = T_{s}(n_{0},m - k_{1} - 1,1) + T_{s}(c - n_{0} - 1,m - k_{1} - 1,1),}\end{matrix} \right. & (75)\end{matrix}$

where

-   if c is even, for n₀=0˜c/2−1
-   if c is odd, for n₀=0˜(c−1)/2−1
-   if m is even, for k₁=0˜m/2−1
-   if m is odd, for k₁=0˜(m−1)/2−1.
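The folding of equation (75), together with the register behaviour described for FIG. 15 and FIG. 16, can be sketched in Python as follows. This is a simplified software model that assumes an even c and handles a single sequence T(0..c−1); the function name `fold` is hypothetical.

```python
def fold(t):
    """Fold a c-point sequence: T'(n0) = T(n0) + T(c - n0 - 1), n0 = 0..c/2-1.

    Hypothetical model for even c only. The upper half is stored directly
    and the lower half is accumulated into the same registers.
    """
    c = len(t)
    if c % 2 != 0:
        raise ValueError("this sketch only covers even c")
    regs = list(t[: c // 2])       # records 0 .. c/2-1 stored directly
    for i in range(c // 2, c):     # records c/2 .. c-1 combined with registers
        regs[c - i - 1] += t[i]
    return regs

folded = fold([1, 2, 3, 4, 5, 6])  # c = 6 gives three folded values
```

The same routine would be applied separately to each of the eight sequences listed in equation (75).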

With reference to the relation above, a total of 8c registers is required, and the corresponding hardware architectures are shown in FIG. 17(A) and FIG. 17(B). FIGS. 17(A) and 17(B) are schematic diagrams of a complete intermediate-stage operation architecture according to the embodiment, where

${\theta_{M\; 1} = \frac{\left( {{2n_{0}} + 1} \right)\left( {{2k_{1}} + 1} \right)\pi}{4M}},{\theta_{M\; 2} = {\frac{\left( {{2n_{0}} + 1} \right)\left( {{2\left( {m - k_{1} - 1} \right)} + 1} \right)\pi}{4M}.}}$

TABLE 5.5.2 Cosine and sine factors' multiplication operation and data folding process

Multiplier | Adder | Register
16 | 16 | 8c

Table 5.5.2 indicates the hardware required for the cosine and sine factors' multiplication operation and data folding process. It is known from Table 5.5.3 that the hardware cost for the intermediate stage of operations is relatively high. To overcome this, the architecture is further improved.

As cited, the hardware cost for the intermediate stage of operations is relatively high, requiring 24 multipliers and 20 adders in total. It is also easy to see in Table 5.5.3 that the intermediate stage accounts for 75% of the multipliers and 67% of the adders in the entire architecture, such that a total of 32 multipliers and 30 adders are required for the entire architecture. This is not desired in the invention because, although the operational speed or bit rate of the recursive architecture is increased, the price is a large amount of hardware resources. Thus, the hardware needs to be reduced further to lessen this drawback of the method.

TABLE 5.5.3 Hardware resource analysis

Stage | Multiplier | Adder
First-stage DCT-III/DST-III | 3 | 5
Second-stage DCT-II/DST-II | 5 | 5
Intermediate-stage operations | 24 | 20
Entire architecture | 32 | 30
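The totals and percentages quoted for Table 5.5.3 can be re-derived with a few lines of Python. The dictionary layout and variable names are assumptions made for this illustration; the counts are those listed in the table.

```python
# (multipliers, adders) per stage, as listed in Table 5.5.3
resources = {
    "first_stage": (3, 5),
    "second_stage": (5, 5),
    "intermediate_stage": (24, 20),
}

total_mult = sum(mult for mult, _ in resources.values())
total_add = sum(add for _, add in resources.values())

inter_mult, inter_add = resources["intermediate_stage"]
mult_share = inter_mult / total_mult   # fraction of all multipliers
add_share = inter_add / total_add      # fraction of all adders
```

With these counts the intermediate stage holds three quarters of the multipliers and two thirds of the adders, which is why the text targets it for hardware sharing.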

First, the feature of the intermediate-stage operations is observed: the input data is the results obtained from the first-stage operations, i.e., the intermediate stage operates only when the first stage generates output data. As cited above, the first stage generates output data every m cycles, so the intermediate stage operates every m cycles. Based on this feature, the proposed solution uses the first-stage and second-stage hardware to support the intermediate-stage operations, and in this case the first-stage and second-stage circuits are halted to provide additional operational time. Namely, after the first stage generates the output data every m cycles, the first-stage and second-stage circuits are halted. The hardware action on halting is described as follows:

1. The first halt cycle uses three multipliers and one adder in the first stage, and five multipliers and three adders in the second stage, to complete the operations of the aforementioned cosine and sine factor generation device.

2. The second halt cycle uses four multipliers and five adders in the first stage, and four multipliers and three adders in the second stage, to complete the operations of T_(c)′(n₀,k₁,0), T_(c)′(n₀,k₁,1), T_(s)′(n₀,k₁,0), T_(s)′(n₀,k₁,1) in FIGS. 17(A) and 17(B).

3. The third halt cycle uses four multipliers and five adders in the first stage, and four multipliers and three adders in the second stage, to complete the operations of T_(c)′(n₀,m−k₁−1,0), T_(c)′(n₀,m−k₁−1,1), T_(s)′(n₀,m−k₁−1,0), T_(s)′(n₀,m−k₁−1,1) in FIGS. 17(A) and 17(B).

FIG. 18 is a schematic diagram of the operations corresponding to the halt cycles according to the embodiment. As shown in FIG. 18, with three halt cycles, the hardware architecture in the first and the second stages can be shared to replace the 24 multipliers and 20 adders required for the intermediate stage.

FIG. 19 is a schematic diagram of the modified recursive type-III discrete cosine/sine transform device according to the embodiment. FIG. 20 is a schematic diagram of the recursive type-II discrete cosine/sine transform device according to the embodiment. With reference to FIGS. 19 and 20, each halt cycle uses the multipliers and adders in the first and the second stages, and the corresponding hardware is shown in Table 5.5.4.

TABLE 5.5.4 Corresponding hardware of each halt cycle

Halt Cycle | Multiplier Number | Adder Number
1 | 1, 2, 3, 4, 5, 6, 7, 8 | 1, 6, 7, 9
2 | 1, 2, 3, 4, 5, 6, 7, 8 | 1, 2, 3, 4, 5, 6, 7, 9
3 | 1, 2, 3, 4, 5, 6, 7, 8 | 1, 2, 3, 4, 5, 6, 7, 9

The concept of the common hardware is to provide different input signals to the hardware at different time points; thus, additional multiplexers are added, and their select lines are controlled to select the different input data. Therefore, the purpose of sharing the hardware is achieved. It is known from Table 5.5.5 that the number of transistors of a multiplexer (MUX) is far smaller than that of an adder or of a multiplier, so using multiplexers to reduce the number of multipliers and adders is very effective.

TABLE 5.5.5 Transistor number of 24-bit component

Component | Latch | Adder | Multiplier | Multiplexer
Transistors | 240 | 672 | 18624 | 192
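As a rough illustration of why the trade is favourable, the Python sketch below combines the transistor counts of Table 5.5.5 with the 24 multipliers and 20 adders attributed to the intermediate stage. The number of multiplexers actually added by the design is not stated in the text, so the sketch only computes a break-even count; all variable names are hypothetical.

```python
# Transistor counts of 24-bit components, taken from Table 5.5.5
TRANSISTORS = {"latch": 240, "adder": 672, "multiplier": 18624, "multiplexer": 192}

# Dedicated intermediate-stage hardware that sharing is meant to replace
removed_multipliers = 24
removed_adders = 20

saved = (removed_multipliers * TRANSISTORS["multiplier"]
         + removed_adders * TRANSISTORS["adder"])

# Number of 24-bit multiplexers whose cost would equal the saving
# (an upper bound for illustration, not a figure from the design).
break_even_muxes = saved // TRANSISTORS["multiplexer"]

# Cost ratio of one multiplier to one multiplexer
mult_to_mux_ratio = TRANSISTORS["multiplier"] / TRANSISTORS["multiplexer"]
```

Under these figures, sharing remains worthwhile as long as far fewer multiplexers are added than the break-even count.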

In addition, the more hardware is shared, the more multiplexers are added; however, different hardware may have a same input signal corresponding to a same multiplexer, i.e., the inputs of the different hardware may come from the same multiplexer. Such a multiplexer with the cited feature is shown in FIG. 20. For example, the multipliers with Number 1 and Number 5 commonly use the multiplexer which selects cos(θ_(f)) or cos(2θ_(f)) as an output. Thus, the input signal of the multipliers with Number 1 and Number 5 only requires sharing the multiplexer com_mux1 shown in FIG. 21.

As shown in FIG. 19, when the recursive type-III discrete cosine/sine transform device 1320 and the recursive type-II discrete cosine/sine transform device 1330 are in the second operational mode, the devices 1320 and 1330 sequentially perform cosine/sine factor multiplication and corresponding addition operations on the c m-point second temporal signals for sequentially generating the c third temporal signals each with m points.

As shown in FIG. 19, the recursive type-III discrete cosine/sine transform device 1320 includes first to sixth registers 1911-1916, first to fifth adders 1931-1935, a first 3-to-1 multiplexer 1951, a second 3-to-1 multiplexer 1952, a first multiplier 1971, a second multiplier 1972, and a fourth multiplier 1974.

As shown in FIG. 20, the recursive type-II discrete cosine/sine transform device 1330 includes seventh to twelfth registers 2017-2020, sixth to tenth adders 2036-2040, a third 3-to-1 multiplexer 2051, a fourth 3-to-1 multiplexer 2052, a third multiplier 2073, a fifth multiplier 2075, a sixth multiplier 2076, a seventh multiplier 2077, and an eighth multiplier 2078.

FIG. 21 is a schematic diagram of using common multiplexers according to the embodiment. In FIG. 21, the output signal of a common or shared multiplexer is output to a multiplexer in front of the multipliers. FIGS. 22(A) and 22(B) are schematic diagrams of using common multipliers and adders according to the embodiment. In FIGS. 22(A) and 22(B), the multiplier numbering and adder numbering are those shown in FIGS. 19 and 20, where a same number indicates the same multiplier or adder. In addition, FirstStage_node in FIGS. 22(A) and 22(B) indicates the original input signals of the multipliers or adders in the first-stage circuit for the DCT-III/DST-III operations, SecondStage_node indicates the original input signals of the multipliers or adders in the second-stage circuit for the DCT-II/DST-II operations, and FirstStage_out1, FirstStage_out2, FirstStage_out3, FirstStage_out4 indicate the DCT-III/DST-III output signals after the first-stage operation, as shown in Table 5.3.1.

As shown in FIGS. 22(A) and 22(B), in a non-halt cycle, every circuit outputs the original signals of the DCT-III/DST-III in the first stage or of the DCT-II/DST-II in the second stage. In a halt cycle, all hardware supports the intermediate stage of operations, and the output signals of each circuit are shown in Table 5.5.6. The common hardware scheme proposed in the invention can effectively reduce the hardware requirement; Table 5.5.7 compares the numbers of multipliers and adders used with and without the common hardware design.

TABLE 5.5.6 Outputs of common hardware in different halt cycles

Output | First Halt Cycle | Second Halt Cycle | Third Halt Cycle
Output1 | Cos2θ_(f), Cos(2n₀+1)θ_(f) | T_(c)(n₀, k₁, 0) | T_(s)(n₀, m−1−k₁, 1)
Output2 | Cos2θ_(b), Cos(2n₀+1)θ_(b) | T_(s)(n₀, k₁, 1) | T_(c)(n₀, m−1−k₁, 0)
Output3 | Sin2θ_(f), Sin(2n₀+1)θ_(f) | T_(s)(n₀, k₁, 0) | T_(c)(n₀, m−1−k₁, 1)
Output4 | Sin2θ_(b), Sin(2n₀+1)θ_(b) | T_(s)(n₀, k₁, 1) | T_(s)(n₀, m−1−k₁, 0)
Output5 | FirstStage_node | T_(c)′(n₀, k₁, 0) | T_(c)′(n₀, m−1−k₁, 0)
Output6 | FirstStage_node | T_(s)′(n₀, k₁, 1) | T_(s)′(n₀, m−1−k₁, 1)
Output7 | FirstStage_node | T_(c)′(n₀, k₁, 1) | T_(c)′(n₀, m−1−k₁, 1)
Output8 | FirstStage_node | T_(s)′(n₀, k₁, 0) | T_(s)′(n₀, m−1−k₁, 0)

TABLE 5.5.7 Multiplier and adder number

Design | Multiplier | Adder
Non-Common Hardware Design | 32 | 30
Common Hardware Design | 8 | 10

Hardware Action and Cycle Number Estimation:

For implementing the proposed method in the invention, it is known as cited that the input data is pre-processed and sequentially input to the first stage of the hardware architecture for operation. The first-stage hardware performs m-point DCT-III/DST-III operations and, with the improved method and architecture, generates two records of transformed data every m cycles. The data generated every m cycles passes through c cycles in the first stage to produce two sets of c-point data, and accordingly M-point outputs are generated for completing all data operations in the first-stage architecture. In this case, referring again to FIG. 2, the number of cycles is as follows.

m×c×[m/2].  (76)

As cited, it is known that the c-point DCT-II/DST-II operations are performed by the second-stage hardware and, with the improved method and architecture, two records of transformed data are generated every [c/2] cycles. Accordingly, M-point outputs are generated for completing all data operations in the second-stage architecture. In this case, referring again to FIG. 2, the number of cycles is as follows.

[c/2]×c×[m/2].  (77)

The invention uses a pipelined architecture to implement the required hardware, in which the first stage generates the c-point data. The c-point data is operated with the intermediate-stage cosine/sine factors and then introduced into the second stage. By pipelining, the first-stage and second-stage circuits can be operated concurrently as shown in FIG. 6, and the operational periods of the entire architecture are shown in Table 5.6.1 as follows.

TABLE 5.6.1 Cycles of M-point DCT-IV operation (no common hardware)

c \ m | m even | m odd
c even | ${m \times c \times \frac{m}{2}} + {\frac{c}{2} \times c}$ | ${m \times c \times \frac{m + 1}{2}} + {\frac{c}{2} \times c}$
c odd | ${m \times c \times \frac{m}{2}} + {\frac{c + 1}{2} \times c}$ | ${m \times c \times \frac{m + 1}{2}} + {\frac{c + 1}{2} \times c}$

In the invention, storing the cosine and sine factors would require an overly large memory, so the circuits of the factor generation device are used to reduce the ROM size at the cost of additional multipliers and adders. In addition, since the intermediate-stage operations also require a lot of hardware, the first stage and the second stage are re-designed to share the hardware and thereby reduce the number of adders and multipliers, as shown in Table 5.5.6. However, the operational period is slightly increased due to the common hardware, as shown in Table 5.6.2.

TABLE 5.6.2 M-point DCT-IV operational period (with common hardware)

c \ m | m even | m odd
c even | ${\left( {m + 3} \right) \times c \times \frac{m}{2}} + {\frac{c}{2} \times c}$ | ${\left( {m + 3} \right) \times c \times \frac{m + 1}{2}} + {\frac{c}{2} \times c}$
c odd | ${\left( {m + 3} \right) \times c \times \frac{m}{2}} + {\frac{c + 1}{2} \times c}$ | ${\left( {m + 3} \right) \times c \times \frac{m + 1}{2}} + {\frac{c + 1}{2} \times c}$
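The entries of Table 5.6.1 and Table 5.6.2 can be evaluated with one small Python function. The function name `dct4_cycles` and the flag `shared` are assumptions made for this sketch; the integer divisions reproduce the even/odd cases of the tables, and the three extra cycles model the three halt cycles of the common hardware design.

```python
def dct4_cycles(m, c, shared=True):
    """Cycle count of the M-point DCT-IV (M = m*c) per Tables 5.6.1 and 5.6.2.

    Hypothetical helper: shared=True adds the three halt cycles of the
    common hardware design (Table 5.6.2); shared=False gives Table 5.6.1.
    """
    halt = 3 if shared else 0
    m_half = (m + 1) // 2   # m/2 for even m, (m+1)/2 for odd m
    c_half = (c + 1) // 2   # c/2 for even c, (c+1)/2 for odd c
    return (m + halt) * c * m_half + c_half * c

# Assumed example split: m = 4, c = 8
without_sharing = dct4_cycles(4, 8, shared=False)
with_sharing = dct4_cycles(4, 8, shared=True)
```

For this example split the common hardware design needs 144 cycles instead of 96, which illustrates the period increase that is traded for the smaller multiplier and adder counts.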

From Table 5.6.2, it is seen that the whole operational period depends on the values of m and c. There are many ways of dividing an M-point DCT-IV into m-point and c-point combinations, but the main point worth discussing is which combination of m and c gives the entire hardware the highest performance and the smallest operational period.

First, the processing speed of the first stage in the pipelined hardware architecture cannot be greater than that of the second stage; otherwise, the following stage cannot process the data output by the previous stage in real time, and the entire architecture cannot operate smoothly. For optimally operating the pipelined architecture, the first stage and the second stage need to have a same operational period, and the numbers of points into which the transform is divided, i.e., m and c, influence the operational periods of the first stage and of the second stage. Next, it is known from Equation (76) and Equation (77) that the second-stage operational period is half the first-stage operational period. For an example of even m and c, when the first stage and the second stage have a same operational period, the following equality is derived from Equation (76) and Equation (77).

$\begin{matrix}{m = {\frac{c}{2}.}} & (78)\end{matrix}$
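As a short worked step, assuming even m and c so that the brackets in Equation (76) and Equation (77) reduce to m/2 and c/2, setting the two cycle counts equal gives:

$m \times c \times \frac{m}{2} = \frac{c}{2} \times c \times \frac{m}{2}\quad\Rightarrow\quad m = \frac{c}{2},$

where both sides have been divided by the common factor $c \times \frac{m}{2}$.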

Equation (78) indicates that such an architecture has the highest performance when the number of points in the second stage is double that in the first stage. Thus, as many points as possible are distributed to the second stage rather than the first stage, but m cannot be smaller than c/2, to avoid the second-stage operational period becoming greater than the first-stage operational period.

The proposed method and architecture have been described. For an example of even m and c, as compared with the N²/2 cycles required for the typical recursive architecture, it is known from Table 5.6.2 that the number of cycles required for the inventive design is:
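The selection rule can be illustrated with a small search over the factor pairs of M. This Python sketch is an assumption-laden illustration, not the design procedure of the specification: the helper names are hypothetical, the cycle formula is the one tabulated for the common hardware design, and M=32 is an arbitrary example.

```python
def shared_cycles(m, c):
    """Cycle count with common hardware, following the tabulated formulas."""
    return (m + 3) * c * ((m + 1) // 2) + ((c + 1) // 2) * c

def best_split(M):
    """Return (m, c, cycles) with the fewest cycles among splits with m >= c/2.

    Hypothetical helper: tries every factor pair m*c = M and keeps only
    pairs in which m is not smaller than c/2.
    """
    candidates = []
    for m in range(1, M + 1):
        if M % m == 0:
            c = M // m
            if 2 * m >= c:
                candidates.append((m, c, shared_cycles(m, c)))
    return min(candidates, key=lambda item: item[2])

best = best_split(32)  # assumed example size M = 32
```

For M=32 the search selects m=4 and c=8, i.e., the split with m=c/2, in line with equation (78).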

$\begin{matrix}{{{\left( {m + 3} \right) \times c \times \frac{m}{2}} + {\frac{c}{2} \times c}} = {{\left( {m + 3} \right) \times \frac{N}{4}} + {\frac{c^{2}}{2}.}}} & (79)\end{matrix}$

As cited, the kernel hardware for the recursive type-III discrete cosine/sine transform device 120 and the recursive type-II discrete cosine/sine transform device 140 in the invention can support the DCT-IV/DCT-II/DCT-III/DST-II/DST-III operations concurrently and merge the pre- and post-processing operations for the first permutation device 110 and the second permutation device 150 to implement the IMDCT/MDCT/AQMF/SQMF operations, thereby achieving a co-architecture design of analysis and synthesis filter-banks. Therefore, the operational period is improved as compared with other recursive algorithms.

Although the present invention has been explained in relation to its preferred embodiment, it is to be understood that many other possible modifications and variations can be made without departing from the spirit and scope of the invention as hereinafter claimed.

What is claimed is:
 1. A recursive type-IV discrete cosine transform system, comprising: a first permutation device for receiving N digital input signals and performing a two-dimensional order permutation operation on the N digital signals to generate N two-dimensional first temporal signals, where N is a positive integer; a recursive type-III discrete cosine/sine transform device which is an m-point recursive type-III discrete cosine/sine transform device connected to the first permutation device for receiving the N first temporal signals and repeating a type-III discrete cosine/sine transform c times on the N first temporal signals to generate c second temporal signals each with m points, where N=m×c, and m, c are each a positive integer; a cosine/sine factor generation device connected to the recursive type-III discrete cosine/sine transform device for sequentially performing cosine/sine factor multiplication and corresponding addition operations on the m-point second temporal signals to generate c third temporal signals each with m points; a recursive type-II discrete cosine/sine transform device which is a c-point recursive type-II discrete cosine/sine transform device connected to the cosine/sine factor generation device for receiving the third temporal signals and repeating a type-II discrete cosine/sine transform m times to generate m fourth temporal signals each with c points; and a second permutation device connected to the recursive type-II discrete cosine/sine transform device for receiving the fourth temporal signals and performing a one-dimensional order permutation operation on the fourth temporal signals for generating N one-dimensional output signals, wherein the N one-dimensional output signals are obtained by performing a type-IV discrete cosine transform on the N digital input signals.
 2. The recursive type-IV discrete cosine transform system as claimed in claim 1, wherein the recursive type-III discrete cosine/sine transform device is implemented in a common hardware architecture.
 3. The recursive type-IV discrete cosine transform system as claimed in claim 2, wherein a computational period of the recursive type-III discrete cosine/sine transform device comprises m×(m+1)/2 cycles.
 4. The recursive type-IV discrete cosine transform system as claimed in claim 3, wherein the recursive type-II discrete cosine/sine transform device is implemented in a common hardware architecture.
 5. The recursive type-IV discrete cosine transform system as claimed in claim 4, wherein a computational period of the recursive type-II discrete cosine/sine transform device comprises c×(c+1)/2 cycles.
 6. The recursive type-IV discrete cosine transform system as claimed in claim 5, wherein the recursive type-III discrete cosine/sine transform device comprises first to sixth registers, first to fifth adders, a first 3-to-1 multiplexer, a second 3-to-1 multiplexer, a first multiplier, a second multiplier, and a fourth multiplier.
 7. The recursive type-IV discrete cosine transform system as claimed in claim 6, wherein the recursive type-II discrete cosine/sine transform device comprises seventh to twelfth registers, sixth to tenth adders, a third 3-to-1 multiplexer, a fourth 3-to-1 multiplexer, a third multiplier, a fifth multiplier, a sixth multiplier, a seventh multiplier, and an eighth multiplier.
 8. A recursive type-IV discrete cosine transform system, comprising: a first permutation device for receiving N digital input signals and performing a two-dimensional order permutation operation on the N digital signals to generate N two-dimensional first temporal signals, where N is a positive integer; a modified recursive type-III discrete cosine/sine transform device connected to the first permutation device and having a first and a second operational mode such that in the first operational mode a type-III discrete cosine/sine transform is repeated c times on the N first temporal signals for generating c second temporal signals each with m points, where N=m×c, and m, c are each a positive integer; a recursive type-II discrete cosine/sine transform device connected to the modified recursive type-III discrete cosine/sine transform device and having a first and a second operational mode such that in the first operational mode a third temporal signal is received and a type-II discrete cosine/sine transform is repeated m times on the third temporal signal for generating m fourth temporal signals each with c points; and a second permutation device connected to the recursive type-II discrete cosine/sine transform device for receiving the fourth temporal signals and performing a one-dimensional order permutation operation on the fourth temporal signals to generate N one-dimensional output signals, wherein the N one-dimensional output signals are obtained by performing a type-IV discrete cosine transform on the N digital input signals.
 9. The recursive type-IV discrete cosine transform system as claimed in claim 8, wherein, in the second operational mode, the recursive type-III discrete cosine/sine transform device and the recursive type-II discrete cosine/sine transform device sequentially perform cosine/sine factor multiplication and corresponding addition operations on the c second temporal signals for generating c third temporal signals each with m points.
 10. The recursive type-IV discrete cosine transform system as claimed in claim 9, wherein the recursive type-III discrete cosine/sine transform device comprises first to sixth registers, first to fifth adders, a first 3-to-1 multiplexer, a second 3-to-1 multiplexer, a first multiplier, a second multiplier, and a fourth multiplier.
 11. The recursive type-IV discrete cosine transform system as claimed in claim 10, wherein the recursive type-II discrete cosine/sine transform device comprises seventh to twelfth registers, sixth to tenth adders, a third 3-to-1 multiplexer, a fourth 3-to-1 multiplexer, a third multiplier, a fifth multiplier, a sixth multiplier, a seventh multiplier, and an eighth multiplier.